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ABSTRACT 


Techniques are presented for experimentally computing 
emserete-time model equations from a finite set of sampled 
observations of the system inputs and outputs. Existing 
modeling techniques typically consider simple model forms, 
and often make limiting assumptions and simplifications for 
mathematical convenience. This research extends these 
techniques to efficiently obtain a more accurate model 
equation. Four Key points are examined: (1) form of the 
model equation, (2) choice of the error minimization 
technique, (3) efficiency of model determination and 
evaluation algorithms, and (4) interpretation of the 
obtained model equations in typical applications. 

A new algorithm for efficient model determination, the 
Search Indicator Growth Algorithm, is presented. This 
iterative algorithm efficiently evaluates a set of model 
terms and eliminates the undesired terms. The technique 
produces more accurate and robust model equations, and 
offers significant computational advantages over existing 


techniques. Computer simulated experiments illustrate the 


effectiveness of this method. 
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LT. WTRODUCTION 


A. PRESENTATION OF THE RESEARCH PROBLEM 

This research examines the problem of experimentally 
developing discrete-time model equations to represent, or 
approximate, the input-output behavior of both linear and 
nonlinear systems based on a finite set of sampled 
observations of the system inputs and outputs. 

The traditional approach to the modeling problem 
involves selecting a particular model form, estimating the 
Unknown coefficients of the model from the observations, and 
finally verifying the quality of the model. The particular 
model form 1s commonly chosen for mathematical convenience 
or from some physical understanding of the structure of the 
system. Given a specific model equation to represent a 
system, a number of Pee nineeneis exist for estimating the 
matwes of the model coefficients that minimize a function of 
the fitting error between the model and the system. 

We are interested in developing the techniques needed to 
obtain a suitable model when the underlying structure of the 
system is unknown. With a suitable model, a variety of 
current applications can be approached, including the 
detection and evaluation of failures that affect system 
performance. The problem is how to obtain a useful model 


from the available observations of the system. 
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twee are four key parts of this modeling problem. 
First, the allowed functional forms for the model equation 
must be sufficiently general to permit adequate 
approximation of the behavior of systems of interest without 
requiring an unmanageable number of terms. (The adequacy of 
an approximation is an application dependent consideration 
and will be discussed later.) For most nonlinear systems of 
interest, the discrete-time model form typically used in the 
literature is the Volterra series model. This form does not 
adequately satisfy the accuracy or compact representation 
requirements except for avery restricted class of systems. 

The second Key part is the determination of the best 
error minimization technique for uSe in evaluating any model 
equation. This becomes an area of concern in terms of both 
eceuracy and computational efficiency when we are faced with 
finite length data sequences and measurement noise. 

The third key part is the development of a general 
technique for evolving or "growing" a model equation in a 
computationally efficient and accurate manner. Existing 
techniques for approaching this part of the problem are very 
limited as to the functional model form they can handle, and 
often make somewhat artificial assumptions and 
approximations to obtain even a partial solution. The 
Peomece is typically an inferior model of the system, with 
insufficient prediction accuracy or an excessive number of 


terms in the model equation. 





The fourth part is determining if the obtained model is 
ome “Cesc wodel, in some sense, for use in a particular 
aeplireation. This last point involves an investigation of 
whether any obtained model equation is a preferred 
representation of the system, or just one model from a set 
Or “furetionaily equivalent models. We examine each of these 
four areas in this research and extend the existing 
techniques for developing model equations. 

Most researchers have approached modeling from the 
coefficient estimation perspective, and there is a large 
body of literature on techniques for efficiently estimating 
the values of the coefficients of specific models once the 
model form has been chosen. Levinson [(Ref. 1], Durbin 
[Re@t. 2), Robinson (Ref. 3], and Morf (Ref. 4 and 5] have 
developed computationally efficient "recursive-in-order" 
algorithms for iteratively estimating the coefficient values 
of eeevain linear models. Recently, Lee (Ref. 6], 
Priedlander (Ref. 7], Perry (Ref. 8], and Parker and Perry 
(Ref. 9] have reported on extensions of these algorithms 
that estimate the coefficient values of a wider class of 
model forms. These techniques are shown to be inadequate 
for the more general problem of an unknown system form. 

We approach the modeling problem from a different 
perspective, that of systematically growing a model that 
minimizes the error residual signal (performance modeling), 


peenwer Shan just estimating the coefficients of an arbitrary 
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model form. We have shifted our concern to performance 
modeling since we are really interested in the behavior of 
the system, and not the values of the coefficients of one 
Sarciciear approximation of it. This may seem at first to 
be a relatively minor difference in approach, but the 
development in the following chapters has uncovered some new 


capabilities for more efficient model determination. 


B. DISCUSSION 

Modern systems are both dynamic and complex, yet they 
generally work by cause and effect, i.e. the set of inputs 
operating on the system produces a set of outputs. 
Knowledge of this relationship between input and output, 
enables us to approach various practical applications. We 
Gan ~ppredict reactiog based on the action applied, control 
the output by modification of the input, adapt the 
Celevioral cWaracteristics so that a given stimulus will 
result in a desired response, diagnose the cause by 
observing the effect, and detect and evaluate changes 
(failures) in a system's performance. 

For failure detection applications, it 1S necesSary to 
have some standard or reference by which to make the 
@@termination that a failure has occurred. An often used 
concept provides redundancy in the form of one or more 
additional systems (or subsystems) operating in parallel 
with the original system, compares the various outputs, and 
UH66 “An appropriate criterion to detect a failed system. 


ES, 





It is not feasible to have redundant systems in many 
applications so a Simulation is used, such as a mathematical 
model that approximates the system's performance 
characteristics. In some cases, this model can be designed 
from a detailed knowledge of both the structure and 
components of the system {Ref. 10]. Because extensive 
detail is typically required for creating this type of 
model, we refer to this as "microscopic modeling". But 
there is often insufficient knowledge of the system 
Structure and/or component behavior of real world systems, 
Or the computational complexity is excessive, and an 
alternate modeling technique is needed. 

One concept used by earlier researchers [{Ref. 11 and 12] 
selects a specific model form and uses it to characterizZe, 
Or approximate, the performance of the system from input and 
output measurements of the system. This concept is referred 
to as "input-output" or "macroscopic" modeling, and the 
approximation must be done in some meaningful sense if it is 
to be useful. The choice of this mathematical form 
determines both the quality of the model (extent to which it 
approximates the system behavior) and the meaning of the 
model coefficient estimates. If the mathematical model is 
ome exmeet Or equivalent representation of the System, there 
is a correspondence between the set of system parameters and 
Che seteofemedel coefficients. Various properties of the 


model and the error residual (difference between the system 
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ame realized model outputs under identical input conditions) 
Game be used in applications including failure detection and 
evaluation. Dheweharaetieristics of the measured input 
Signal, and of any meaSurement noise, has a direct effect on 
the model performancer. 

macroscopic characterization also involves model 
building (growing), which adds terms to a given model 
equation to better fit the observed data. it awmodel fit is 
not adequate for an application, then the standard technique 
in the literature (Ref. 11] guesses at a "better" model and 
fits to it the Same data. This "brute-force" technique 
makes little use of the specific features of the 
unacceptable model, and blindly continues until (hopefully) 
an adequate fit eCREEUHSE 

This technique makes some physical and practical sense 
when dealing with a Simple model form corresponding to the 
known form of the system. Examples include linear transfer 
functions (ARMA models) and static polynomials. In these 
cases, each succesSively larger model provides a better fit 


fee ne Precedqing fit can be considered as a reduced order 


1 Two interesting casesS regarding the input Signal are; 
(1) we have little or no control over the characteristics of 
ene Vaeut Sienal, and (2) the input signal (or probe) is 
under our complete control for the system characterization 
process. For the work that follows, we consider case (2) 
with the assumption that the input meaSurements are 
representative of the normal system operation. Sece1 OM ee eo 
Chapter VI examines both of these situations. The impact of 
measurement noise is also addressed in Chapter III. 


1, 





model of the system. This technique is of dubious value 
when the form of the system is unknown. We need to develop 
alternative growth techniques that are more useful in the 
general case. 

A particular recursive algorithm was introduced by 
Levinson (Ref. 1], and adapted by Durbin (Ref. 2], to 
efficiently obtain the solution of a specially structured 
set of linear equations. Dosusin cee Chiswalgorithm, a @rucial 
Simplification was often made by earlier researchers [Ref. 
4-9, 13 and 14). By limiting the form of the model and 
assuming that the input sampled data sequence is ergodic, a 
special mathematical structure can be induced into the model 
evaluation equations, and exploited to save a significant 
amount of mathematical computation. This ergodic assumption 
has been rationalized from different points of view, and has 
resulted in various related evaluation techniques. The 
Simplification will be examined in detail in Chapter III, 
Since its true effects do not appear clearly in the 
literature. While the simplification appears reasonable in 
the isolated context in which it was made, it will be shown 
that the net effect of model evaluation using this 
Simplif@Peation is typically that the obtained model is 
suboptimal in prediction performance. 

Even without these assumptions, the recursive solution 
algorithm has limited usefulness. While feasible for 


@yewreat linear model forms, it is shown that the recursive 





algorithm rapidly becomes computationally prohibitive when 
considering general nonlinear models. Alternative model 
growth techniques are therefore needed. 

It is often overlooked that from the same input 
Sequence, different system equations can give the same 
Output sequence. The result is indistinguishable 
performance characteristics from structurally different 
systems, or equivalently, multiple different (and often 
independent) model equations each adequately describing the 
performance characteristics of a single system. 

This polnt will be diseussed further because it has 
implications for the fault detection and evaluation 
application. Using the integer n as the discrete time 
index, the system input sequence is denoted as {u(n)}, the 
output sequence as {y(n)}, and the residual sequence due to 
inaccuracies in the model equation as f{e(n)}. AsSume a 
suitable model has been obtained and is uSed in conjunction 
with the real system as shown below. 


Input sequence {u(n)} Output Sequence {y(n)} 









Ss Ol 






Residual Sequence {e(n)} 







MODEL 





Figure 1: Configuration for Fault Detection 
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A se@enificant change #n the characteristics of the error 
residual sequence {e(n)} could be used to indicate that 
there has been a change in the behavior of the system (fault 
detection). There are applications where we would like to 
uniquely determine the change in the system (fault 
Peadluwm@bion). This last capability requires that there exist 
a unique one-to-one relationship between the value of the 
system parameter that changed, and the resulting value of a 
model coefficient. This is obviously not possible when 
Peaere are two or more structurally different but 
equivalently performing model equations. 

The existence of a model equation that exactly describes 
a finite set of input-output measurements does not imply any 
uniqueness properties [Ref. 15]. This can be demonstrated 
by considering particular examples of structurally different 


but equivalently performing models for a given input. 


Example 1.1: The time series of meaSurements 
Seem 1,295,.00,7,-.-.-} = {1,1,3,7,17,41,99,239,...} and 
SeeeenemeeO lee, 3,..-,7,---} = {0,1,2,5,12,29,70,169,...} are 


equivalently described over any interval with n>1 by both of 


the linearly independent model equations; 


y(n) = u(n) = y(n-1) Os ete 
and 


men 1) + y(n—1) ieee 


y(n) 
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Example 1.2: The time series of measurements 


Miner n=O, 1.2,...,7,--2) Cll. ceo oo ee we and 
Weel te.--,7,---) = 10,1,0,2,=3,.7,-15,32,...} are 
Squawaiently described over any interval with n>2 by both of 
the linearly independent model equations; 

Pe) = wen) + y(Cn=71) (als 
and 

y(n) = u(n-2) = y(n-1) + y(n-2) fe 

Example 1.3: The equation 

mene = Suen) —- -SuCn=-1)y(n-1) ers) 
and the equation 


y(n) = .9u(n) = .45u(n-1)u(n=-1) + .25u(n-1)u(n-2)y(n-2) {1.6} 


Can be realized as shown in Figures 2 and 3 respectively. 


y(n) 





u(n) 





Figure 2: A block diagram realization of Equation {1.5} 





Figure 3: A block diagram realization of Equation {1.6} 


2) | 





These two different equations will produce the identical 
output sequence {y(n);S<=n<=T} for any given input sequence 
meeos == 6}. Note that {1.5} amd {1.6} are not 
independent since {1.6} can be obtained from {1.5} directly 
by the following recursion. 

Start by replacing n by n-1 in {1.5}. 

y(n—-1) = .9u(n-1) = .5Su(n-2)y(n-2) Es Lee fa 
Sumer ituting {1.7} into the right side of {1.5} yields: 

mom) = .9utn) — .5u(n-1) [.9u(n-1) - .5u(n-2)y(n-2)] 

= .9u(n) - .45u(n-1)u(n-1) + .25u(n-1)u(n=2)y(n-2) {1.8} 
Semee {1-8} i8 the same as {1.6}, therefore {1.5} and {1.6} 
are not independent. There could be the case where {1.5} 
was the system and {1.6} was the model, or vice-versa. A 
change in one system parameter would not be uniquely related 
to a resulting change in one model coefficient. This also 
explains why an siteeaitue tally. obtained model may explain 
Dre@giceagbility, but not cause and effect. 

Measurement data of a physical system can be matched by 
more than one model equation for a number of reasons. Ite 
may be by coincidence; the input may keep the output of each 
model exactly the same. It was recently shown [Ref. 16] 
that if the system is linear, the existence of linearly 
independent models that match the input-output data can be 
attrabutbed to the particular inputs that generate the 
measurements from which these independently equivalent 


models are computed. 
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Mime preceding discussion shows that the following 
problems must be considered. If we obtain a set of models 
each acceptably describing the input-output performance of a 
System, can we determine if any of these models include the 
Structural properties of the actual system that produced the 
data? Naturally, we should expect that we will rarely be 
able to obtain a model with the same detailed structure as 
the system. The second problem is how to determine the 
“Demc' model for a particular application from those 
candidate models in the equivalence set. Chapter VIII will 
address these problems and present some new results in the 


SComvext Of a DParticular application. 


©. OVERVIEW 

Chapter II provides a review of existing general linear 
and nonlinear model forms and presents an extension to a 
unifying general model that we will use. Chapter [III 
presents various error minimization techniques for 
evaluating candidate model equations and proves the 
generally superior modeling performance of one particular 
‘technique known as the "Covariance" least squares method, 
over the least squares technique commonly used in the 
Mater acure . Additive meaSurement noise is examined, and new 
expressions are developed for the resulting distortion of 
the fitting error and of the coefficient estimates of linear 


recursive models. 


es 





Gmeapter IVY diseusses the equations required to evaluate 
the performance of a model as more terms are included, and 
Presents 4 recursive technique for evaluating the solution 
Or a lange and useful class of equations. The computational 
advantages of this technique are compared with the direct 
least squares evaluation of each model. 

Chapter V discusses the existing model growth techniques 
based on the parameter estimation approach. It shows that 
the recursive technique of Chapter IV reduces to the 
commonly used parameter estimation technique known as 
Levinson's Algorithm when two restrictive assumptions are 
made. The use of these asSumptions typically produced 
inferior models compared to those produced by the more 
general technique. Chapter V also discusses several 
possibleagnonlinear model growth algorithms that are logical 
extensions of existing linear growth techniques. 

Chapter VI presents a new concept in iterative model 
growth based on the developments in the preceding chapters, 
and amalyzes the advantages and limitations. This heuristic 
technique is shown to offer significant improvements over 
Pme existing and previously discussed methods. Chapter VII 
gives the results of computer simulations and real world 
experimental comparisons of the model growth techniques 
developed in this research. 

Chapter VIII examines three additional applications for 


tie modeling methods developed in this thesis. They are 
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heme a@eteection, fault evaluation, and reduced order 
modeling. Specific techniques are discussed and a number of 
eoncepts are proposed. Conclusions are given on the key 


results of this work and areas for future research are 


outlined. 
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Il. CHOICE OF THE MODEL EQUATION FORM 


A. EXISTING MODEL FORMS 

Wwe are concerned with the determination of discrete-time 
models for both linear and nonlinear systems. The logical 
Starting point is a discussion of existing linear model 
Bomms. We limit discussion to single-input, single~output 
Syeteme for simplicity, but using a vector notation, the 
results can be directly extended to the multiple-input, 
multiple-output case. We also limit consideration to models 
whose input-output relationships can be described by time- 
invariant difference equations?. 

After a discussion of linear models, the few general 
dynamic nonlinear models forms found in the literature are 
presented. A more general nonlinear model form that 
subsumes the preceding linear and nonlinear models is then 
discussed. One particular version of this general form is 


then developed in greater detail, and utilized in the 


remainder of this work. 


e It is recognized that there are systems where an input- 
output relationship cannot be exactly described by a 
difference equation. Consider a discrete quantizer whose 
output y(n) at any instant n is equal to the integral part 
of the input u(n) if the fractional part of u(n) is less 
Gham 0.5, “and is equal to 1 plus the integral part of u(n) 
if the fractional part of u(n) is greater than or equal to 
Or. 5. Clearly the input-output relationship of such a system 
cannot be accurately described by a difference equation. 
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Linear dynamic discrete models include the moving 
average (MA), autoregressive (AR), and autoregressive-moving 
average (ARMA) forms. They are well described in the 
literature (Ref. 11, 17, and 18] and are briefly discussed 
here for completeness. 

The MA model predicts the current value of the output of 
a system as a weighted summation of the current and q 
consecutive preceding values of the system input, where q is 
KnoWn aS the order (or memory) of the MA model. Following 
the previously used convention, the sampled observations of 
the system input are denoted as {u(n);S<=sn<=T}, the system 
ouspuc as {y(n);:S<en<=T}, and the residual error due to 
inaccuracies in the model as {e(n);S<=en<=T}. The model 
equation can be written as; 


mG) u(n) + a1) u(n—-1) +...+4 aU ean u(n-q) + e(n) 


y(n) 


q 
= ’. A? (4) u(n-i) + e(n) 25. 1} 
T=0 


lime Coeiticients are the a4) factors thae mulciply each 
corresponding a delayed input term. The (q) superscript 
is used to emphasize the dependency of each coefficient 
value on the order of the model, and the Superscript 
notation is dropped when no ambiguity results. 

These models are called moving average because the 


current output is a weighted average of a finite "window" 


2a 





Dessing Over the present and past input values. Models of 
mae fomm of Eq. {2.1} are denoted as MA(q). 


The pth 


order AR model predicts the current value of the 
system output as a weighted summation of p consecutive 


preceding output values. Using similar notation, the 


equation for this model is written as; 


(Pp) (Pp) (p) 
Ce omen )eeob €2) y(n—-2) +...+ b (p) y(n=p) + e(n) 


camer y(n-j) + e(n) ene 


ufo 


a2 

The coefficients are the 5 re) factorsmtnat multiply seach 
corresponding ae delayed output term. Models of the form 
of Eq. {2.2} are denoted as AR(p). 

Despite their simple form, MA(q) and AR(p) modeling of 
even simple linear systems often requires an excesSively 
large number of model terms (a high order model). A natural 
emvension of these two models is a combination of both. 

Such mixed models are called autoregressive-moving average, 
or ARMA, models of orders p and q, and are often written as 
ARMA(p,q). The ARMA model predicts the current value of the 
system output as a weighted summation of the current value 
of the system input, q consecutive preceding values of the 
system input, and p consecutive preceding values of the 


System output. Combining the previous notations produces 


the model equation; 
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(q) (q) 
min) =@© (0) u(n) +@& (1) ul(n-1) ee ea) u(n-q) 


(Pp) (Pp) (Pp) 
-B Cry sy Cn |) my; C2)" ¥ Cn) es) (P) y(n=p) + e(n) 


q (q) PAP) 
= § @& (i) uln-i) + YB (j) y(n-j) + e(n) = {2.3} 
r= 0 ‘ce 


The coefficients are the 66) snail Gap factors that 
multiply each corresponding ae delayed input term and 
oo delayed output term, respectively. Real world linear 
Systems typically include feedback, and can be adequately 
modeled with the smallest number of terms by an ARMA model. 
The literature discusses two general dynamic discrete- 
time nonlinear models. The Voltera mkdel (Ref. 19, 20 and 
21] can be thought of as a nonlinear generalization of the 
MA model. This model predicts the current value of the 
System output as a linearly weighted summation of increasing 
degree products of the current and m consecutive preceding 
input values. Using the typical notation followed in the 
Becveratvurée ash a Guide, the equation for this model form is 
Written as; 
d 
y(n) = _ f Cu(n-i);i=0,1,2,...,m] + e(n) foe 


where 


m k 
a (g +8 se 200 pf ) u(n—-g,.) 
se Kol 2 K IT er eea5) 


Re 


tee isocalled tne aaa Volterra kernel. 


ZS, 





= 


meen the degree d equals 1, Eq. {2.4} reduces to the form of 
tae MA model. 

The Volterra model is based only on a sum of products of 
past and present input values. Because of this nonrecursive 
form, the Volterra model is unable to compactly represent a 
System that includes significant feedback. A model of the 
form of Eq. {2.4} is denoted as VOL(d,m). The lower limits 
of the summations in Eq. {2.5} were purposely chosen to 
eliminate redundant terms, and therefore we have minimized 
the number of equations that need to be solved in the 
evaluation of any particular VOL(d,m). The upper summation 
limits of Eq. {2.5} are all set equal to the integer m for 
womatvional clarity at this point. We could, of course, use 
a more general notational convention for the upper summation 
img (@.g. MH 3; 151,2,...). A more general upper summation 
limit notation would produce more complexity in the 
equations, and offers no specific advantages for the problem 
examined in this thesis. The Volterra model of a system may 
not require all of these terms indicated by Eq. {2.5}. 

The Bilinear model (Ref. 22, 23, 24 and 25] predicts the 
current output of the system as a linearly weighted 
Summation of the current and m consecutive preceding input 
values, plus a linearly weighted term composed of the 
product of one of m preceding output values with the current 
or ome of m preceding input values. Using the typical 
meitation found in the literature, the equation for this 
model form is written as; 
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m 
Pon) = > i) u(n—-1) 
i= 0 


= oe (m) 
+ » y eC, j) Wins y(n=-j) + e(n) lee Me, 
he y=! 

This form includes bilinear terms composed of the 
prmoagucts of specific input and output factors, a feature not 
found in the previously discussed model forms. However, it 
is limited to the one type of nonlinear form shown above. 
Mee@elissof the form of Eq. {2.6} are»-denoted as BIL(m).. The 
restriction to equal upper summation limits is again used 


mor cidaagpity. The Bilinear model of a system may not require 


all of these terms indicated by Eq. {2.6}. 


B. DEWELOPMENT OF A MORE GENERAL MODEL FORM 

When used for the modeling Of Ja vty plcalenon linear 
system, the Volterra model form suffers from the same 
limitations as the MA model form does for linear systems. 
The existence of any feedback in a system will usually 
result in the requirement for the order m to be very large 
inee@. t2e.1} im the linear case, or in Eq. {2.4} in the 
nonlinear case. This property of nonrecursive model forms 
Peeurts in the need for an unacceptably large number of 
model terms to adequately represent the behavior of typical 


systems. 
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De wawwmat extension of the Volterra model is to include 
molterra-lmke terms of the output of the system, in a manner 
Similar to the relationship between the MA and the ARMA 
models. An investigation of the effect of feedback in some 
common nonlinear systems leads to the conclusion that it is 
also useful to inelude terms that are extensions of the 
Bilinear model form. Such an extension has been made and 
Seme Dpartial results concerning different versions of this 
new model form have been published [Ref. 9, 26 and 27]. 

One version of this model form was called the Nonlinear 
ARMA model in references 9 and 26. To better distinguish 
the properties of a more general form of the model, it is 
denoted as the Bivariate Volterra Model (BVM) in reference 
ce, wad imethe work that follows. 

The coefficient notation of the previously discussed 
linear and nonlinear models forms follows the conventions 
hound in the literature. Considerable thought was given to 
the need for suitable notation for the more general and 
complex model form, and also for the developments that 
follow. It was decided to have a uniform coefficient 
notation that could be applied to any of the models of this 
G@eepocer. Following is a compact equation for BVM(d,m), a 
BVM of degree d and memory m in terms of this uniform 


Goerfiecwent notation. 


Sa 





r 
a ar as a alka 
1 a 
m S 
2 Ser eee ss alan 
= = 


G S 


-,h,)] u(n=g) ] [y(n-h,) + e(n) 
=| =o 
G2 


ime eeatficientis’ are the factors starting with Or.g: 


9 and O,. im @a. {2.7}.  WNote that two subscripts and 


OG"s * Ss 
one or more parameters in parenthesis are included for each 
coer ficient. A P59 eoerrrerent is used in conjunction with 
a term composed exclusively of r products of past and 
present impue factiors. Likewlse, Os coefficient is used 
mm Cenjunetion with a term composed exclusively of s 
Broe@uets of past output values. The subscript in each of 
these cases distinguishes the number of such factors in the 
corresponding model term. Finally, we use Pres for the 
coefficient of the model term composed of r input factors 
emd S output factors. In all cases, the parameters in 
perrenthesis in each model coefficient distinguish the 
partieular lag factors. 

A model of this general form can be completely described 


by either specifying a particular degree d and memory m, or 


by just providing the distinguishing parameters and 
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Sscripts of each desired coefficient. Following is an 
example of a second degree, first order BVM equation which 


is denoted as BVM(2,1). 


y(n) = 8), (0) ulm) + 0,961) uln=1) + 05,9 (0,0) means 


Cy Ga ie 


+ Epo: | u(n)u(n=-1) + 85.9 
+ Oh (0 +1) u€(n)y(n=-1) + Org > 2 Uta 1) y CoS 1) 
a Waren) + Bats) ACHETOC Sea 13.8} 


Ia?s coefficient notation completely specifies the model 


terms, aS demonstrates in the following examples. 


@4.561,2,3) ie the coefficient of term u(n=1)u(n=-2)u(n-3) 


Mpeg. 1. 1) iw thee coefficient of term u(n)u(n=-1)y(n=1) 
ime Cierce of the various lower summation limits in Eq. 
{2.7} eliminates redundant model terms. The upper summation 
limits are set equal to m for notational clarity, as was 
aeons tor the VOL atrrd BIL forms. Because the upper summation 
limits of Eq. {2.7} were all set equal to m in the preceding 


pages, a compact expression for the number of coefficients 


in a full BVM of degree d and memory m can be written as; 


g Ss r o 
d YT [(m+i) d | Edam dd Tl itm+id] ae 
jz j= 


> Tol e ox yy | 
ra s! rt s! 


r=] Sil r=1 s21 {2.9} 








This equation is used in subsequent chapters when the 
computational complexity of evaluating this full model form 


1s considered. 
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The BVM form defined in Eq. {2.7} is limited to the 
Summation of products of integer powers of past and present 
maput terms, and past output terms. Ottrver functional forms 
besides sums of integer products are possible, but this form 
appears to be most tractable for our modeling. Examination 
mrmea. i2.)} through Eq. {2.6} confirms that the BVM form 
subsumes the AR, MA, ARMA, VOL, and BIL model forms. 

For example, an ARMA(p,q) is subsumed by a BVM(1,m) when 
the degree d of the BVM is set equal to 1, and 
m = maximum (p,q). This only allows terms with descriptive 
coefficients en 87? ee ©9931 65): where s0<=i<=meand 1<=)<=m. 
This includes all the terms of an ARMA(p,q). 

The BVM form is emphasized because it includes the other 
general forms discussed in the chapter and commonly used in 
the literature. Using this BVM form rather than the ARMA, 
VOL or BIL forms, typically produceS a more compact and 
accurate representation of nonlinear systems with feedback. 


Chapter VII contains several computer simulated experiments 


demonstrating this point. 
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IIT. EXAMINATION OF ERROR MINIMIZATION TECHNIQUES 


Ae 6DISCUSSION 

Assume we are given any particular model form linear in 
some finite number c of unknown coefficients, and a set of 
input-output data of length N>c. We are interested in 
determining the particular model equation that, in some 
meaningful sense, best approximates the performance of the 
system that produced the N output measurements from the 
corresponding N input measurements. There are different 
error criteria, including least squares and minimax, that 
could be used in minimizing the error residual. Least 
squares techniques minimize the average squared residual 
sequence value ina given interval, while minimax techniques 
minimize the maximum absolute value of the residual sequence 
mi th® interval. 

Difficulties with least squares, including degraded 
modeling performance under noisy conditions, have been 
reported (Ref. 13]. Nevertheless, it has been decided to 
investigate the use of least squares techniques for the 
fombewine reasons: (1) least squares minimization for models 
linear in the coefficients leads to a set of tractable 
linear equations in the unknowns, (2) there exists a large 
body of parameter estimation techniques in the literature 
based on least squares, and it is possible to extend some of 


these for our model growth and evaluation problem. 
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mnis ceMapter presents both the theoretical differences 
and results of computer simulated experimental comparison of 
various least squares formulations when applied to systems 
Goneracterization. In the simulation study, two criteria 
used for comparison purposes are: (1) average squared 
Mestoawat Value (fitting @rror), and (2) accuracy in 
estimating model coefficient values. An examination of the 
effect of additive output noise is presented in the last 
section. 

An example of a linear-in-the-coefficients nonlinear 
difference equation uSing the coefficient notation 
mrerodueed in Chapter II is; 


men) = Bg‘)? u(n) gay (1+ 1) y(n—-1)y(n-1) claw | 5 Oe ibe ce 


mo. {307} contains both linear and nonlinear terms in the 
woe Uin) and output y(n). Sinee the coefficients Ores all 
enter in a linear fashion, this equation can be expressed in 
compact vector notation (all vectors in this thesis are 
column vectors). Defining a coefficient vector 9, where 


a’ 


emo Po.961+1), os ae (S523 
and a term vector x(n), where 
x(n)? = [u(n-1),y(n-1) y(n=1) ,u(n) y(n=1) J [3.3] 
Pa. (321} Gan now be expresSed in the vector form; 
ren) = o*x(n) {3.4} 
Assume that we are given a finite set of measurements of 


the input sequence {u(n);S<=n<=T} and the corresponding 


output Sequence {y(n);S<=en<=T} of a time-invariant and 


Beil 





causal system of unknown structure. To reproduce the input- 
output behavior of this system within some moderately small 
error, we choose a linear-in-the-unknown coefficients 
predictor model equation of the following form. 
yGaq) = Q°x(n) + e(n) isis Sy, 
where e(n) is the equation error of the model at time n; 
@ is a vector of length c containing as yet unknown 
coefficients corresponding to each model term; and x(n) 
is a vector of c model terms, each of which is formed 
Mrem the product of a finite number of input and output 
factors from the Set: 


Putnam) ,u(n=m+i1),...,u€n-1),u(n),y(nem) ,y(nem+1),...,y(n—-1)] {3.6} 


where m is some finite number called the memory (or 

order) of the model. The maximum number of input 

S@erors or output factors in any such product 

combination will be called the degree d of the model. 
Note that the above description fits the BVM(d,m) introduced 
ao Chapter II. 

The following example is used repeatedly for 
Lllustration. Consider a linear MA model with q =m <= 2, 


and the coefficient and term vectors; 


Q = (6,,,(9), 953961) 81,962) (3 (3 
x(n)" = een) yuCa=1),u(n-2) } boc s 


This model form is linear in the unknown coefficients 
once the x(n) is specified, and we choose to minimize the 


following monnegative least squares error criterion; 
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y) 3 2 
7= = 1 » See Sa uae ope e 
(ng-n,+1) n=n, N n=n, 
where n, and ny take on fixed integer values. 


To carry out the least squares fit in compact vector- 
matrix notation, we use underscored lower case letters to 
represent vectors and capital letters to represent matrices. 
Scalars are represented with lower case letters whenever 
possible, but occasionally capital letters are used. 


Define the output vector y, where 


y 


yo = Cy(n,),y(n,+1),...,y(n,)] er 10) 
and the data measurement matrix X, where 
» 
X = [ x(n) x(n,+1) ene x(n4) eee, 
Smeemetouecrng 13.5}, {3-7}, {3.10}, and {3.11} into {3.9} yields; 
Ik 
Pen ty - xe)" ty - x79) = 1 | yTy - oxy - yTxo + grx'xe 
N : N { 12} 
Pollowinmg standard least squares theory (Ref. 11 = 14], 


foes EN Sl Uetion “equation for the coefficient vector 9 that 
minimizes Eq. {3.12}, and the corresponding value of the 
minimum error criterion are determined. The details of the 
well known least squares derivation are included for 
notational development. Differentiating Eq. {3.12} with 
respect to the vector 9 using matrix calculus and equating 


the result to zero yields; 


Sa = 0 = Py + 2 x XO {3.13} 
G2 N N 
Simplifying Eq. {3.13} produces; 
1 x go 2 a xy £3.14} 
N N 


oe 








ae ts Convenient to use the following compact notation for 
fms set of c Simultaneous linear equations. 


RO SF ag is ee is 


where R = xX ex {3.16} 


moeoume positive semi-definite least squares matrix of size ec, 


and - sO ou tt 


i 


‘zz 


mes COLUMN vV@e@tor of size c. The factor 1/N is retained in 
emese definitions for subsequent distinction. 

To eure that Eq. {3.13} represents a unique minimum, 
the second derivative of jigs with respect to 9 must be 
positive. Weplying this result to Eq. {3.12} yields the 


ead@d condition that; 


O7;2 . 
Oc'do 


Bawecwions {3.15} are known as the normal equations, and 


[ x*x | =WeR > 0 {3.18} 


=| 





there is a unique solution if and only if matrix R is 
Peseitive definite. Thissuntoque solution is; 


9 = gtr {3.19} 


Using {3.16}, {3.17}, and {3.19} with {3.12}, the minimum 
2 


Venue OF J~ is; 
yO | toe 
N 
eae - 3 e (3.20) 
N 
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We now investigate how the effect of the choice of the 
values of n, and n, ieee 13540). cChangess the eresuiting 
value of the error criterion Wo. The main purpose of this 
investigation 1s to identify the reasons for the observed 
numerical differences in the results of various least 
squares formulations that commonly are used in the 
Preorature. Many recent researchers [Ref. 1 - 9, 13, 14, 
amd 16 =~ 29] put émphasis on computational simplicity and 
make assumptions or approximations related to the values of 
O 3 and n, merao lndiwee Special structure into the solution 
equations {3.19} for 6. We consider a number of distinct 
eeees that are discussed but not clearly compared in the 
secvéerature. 

C19 1f n,<S+m and n,<sT, this is equivalent to the 

aSemmption that u(n)=0 for n<S and y(n)=0 for n<s, 

and is Known in the literature as the "Prewindowed" 

eater [ Ref. 4, 6, 28 and-29]. 

Caer Int n, >=S+m and nT, this is equivalent to the 

aaeiimiption that u(n)=0 for n>T and y(n)=0 for nT, 

is Known in the literature as the "Postwindowed" 

case [Ref. 29], and is seldom used. 

Coem Lf n,<S+m and n,>T, we get both prewindowing 

ana) BpOstewindowing since this is equivalent to 

assuming that u(n)=0 for n<S and noT, and also 


Bimgemy¢n)=0 for n<S and n>T. This is equivalent 


to rectangularly windowing the measurements. cs eS 
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euomm as the "Autocorrelation" case [Ref. 7, 14, 25, 
and 28 - 30], and is the typically used method. 
(4) If n,?=S+m and n,<-T, no window is applied 


to the observed measurements, and the so called 


“Sovaertance"’ case is realized [Ref. 28, 29 and 31]. 


Depending upon the specific choice of ny and ny there are 
many different least Squares error criterion values 
sl, n,) . and related model coefficient estimates Otn,.n,), 


por @ B@iven set of input-output data. The literature 
typically reports the use of the Autocorrelation method for 
Statistical considerations and because this can often lead 
mo "an efficient solution algorithm. THiS pointe is discussed 
Pumener in @ subSequent section. 

Examination of these four different methods from the 
unifying framework of the least Squares equation {3.9}, 
reveals an interesting comparison basis for explaining the 
Subsequent differences in form and performance. This 
development does not appear in the systems identification 
literature and clearly indicates which error minimization 
method should be used for the performance modeling approach 
to the general model growth problem. The main result is 
that the Covariance method generally gives Superior modeling 
Pegults in terms of lower fitting error i and more accurate 
model coefficients in the vector 9. The differences in 
these four methods are described analytically in terms of 


the following example, generalized in the theorem that 


42 





- 
; 





mee bows, and finally demonstrated in a computer simulated 
experiment. 


EXAMPLE: Beco isn and 1 = 10, Then we have the data 


{u(n) } Meyers), UC3S),..., ul10)} {3 


eee = {¥C1), yC2), yC3),..., y(10)} | She 


meu tre model be given by the equation: 
men) = 91.969) u(n) +94.9(1) u(n-1) +95. 62) u(n-2) + e(n) 


Using least squares 


n 


i ae eae {3 
2 
where WN = na-no+l, and=witere the coefficient vector is; 
ic 
G9 = ( P5969? > 91.96): 91.962? ] {2e 
leads to 
ism le = sexy (3 
N N 
where 
ee = ile Gt yy (2) 9903)... ony 10)] {3 


and X is the N x 3 data matrix involving {u(n)}, whose 
Sonvencs depends upon the choice of no and ny as shown in 


Ehe followimg four cases. 


U8 


ely 


223 


{3 


24} 


25} 


pyle 


ets 


nee 





— ee 3 = = 10, and the data matrix becomes; 
u(1) 0 0 

nee) =U C1) 0 

eso UCZ)) 6 61) 

u(4) u(3) u(2) 

eS), wee) ) =u (3) {3.28} 
Ueo) sue 5) |6u C4) 

ier) 06) ul5) 

Hos) ut7¥) u(6) 

meg) ucts) ul7) 


u( 10) u(9) u(8) 


The solution of the normal equations {3.26} is now given by; 
aa 


32.95 





Metce that the square matrix in Eq. {3.29} has different 


Summation limits along each diagonal parallel to the main 


diagonal. 
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Case 2: 


Postwindowed method 


ee — ———————— 


Here nN, =Seme l+e=3, n 3 =T+m=2=10+2212, and the data matrix becomes; 


u(3) ul2) u(1) 
u(4) ul3) ul2) 
u€5) wi4) u(3) 
x = meen wuts) ulC4) (32308 
ery C6) ul5) 
uGs) WK7) ul6) 
wo9) wits) wul7) 
u( 10) u(9) u(8) 
6) Meo) ul9> 
0 0 u( 10) 


The solution of the normal equations {3.26} is now given by; 





10 al 10 | 10 ie 
dee Ca u(n)u(n=1) 1 1} u(n)u(n-2) 
TO nes ; TO dass (er Ss 
ee +—-—-————-— ——-4—-—-—-—-—-----— 

| 10 | 10 
= (eel u(n) ius > u(n)u(n-1) 
P10. nee 110 Bre 
arson fe me we i Se | a 
l 10 
SYMMETRIC ieee y. ucn) 
| 10 n= 
10 
dias u(n)y(n) 
16, Be. 
10 
” u(n-1) y(n) 
TO Wes 
10 
a u(n-2) y(n) 33 la 
10 = 


Note that the square matrix in £q. {3.31} has a different 
set of summation limits along each diagonal Daiealwed, toOsone 


@aan diagonal. 
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Geisie 3: 


Here Nn» aS =01), n3 


=T+m=10+2=12, 


hatocorrelation method 


and the data matrix becomes. 


a1) 0 0 
mee) ul 1) 0 
mse ute) wl) 
om) = ul3) w2) 
mS) wG@t) wd 3) 
k = Weer ues) 8 ut 4) (ersey 
mc7) uGOde u(5) 
meoy uly) us) 
eo) a(S) wt 7) 
eNO wues) ule) 
0 u(10) u(9) 
0 0 u(10) 


mae SOoluUtTiON of the normal equations {3.26} is now given by; 





10 ”) ! 10 { 10 : 
power | i u(n)u(n=1) 4 u(n)u(n=2) 
n=1 ie n=2 tte n= 

—_ ome ce ewe oe = oe oe ae oa a eet 
10 9 10 
oe, PE UG) 11 > u(n)u(n-1) 
ire n= 1 Ia nee 
ae a ce i ee 
10 5 
SYMMETRIC iit. Yun) 
b- beanies 
10 
ee. wn yn 
(ee Fw | 
10 
1 u(n=-1)y(n) 
2) nee 
10 
if 3 wene2) y(n) {3.33} 
ie n= 3 


Wore Ghae the square matrix in Eq. {3.33} is symmetric, 


Toeplitz (equal values along every diagonal parallel to the 


main diagonal), and the summation limits are all the same 


along any diagonal parallel to the main diagonal. 
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Swe partaicuWar structure of the symmetric Toeplitz 
matrix in Eq. {3.33} was developed here strictly from a 
consideration of the error minimization limits. The 
literature contains numerous references to least squares 
Deon ces Withethis special structure, but it is usually just 
stated or developed along different lines?. After 
presentation of the fourth case, we will discuss the 


imepli@ations of each. 





Shor efample, Baheti (Ref. 23 and 24], Hsia (Ref. 14], 
and Iserman [Ref. 13] all utilize what they call 
“correlation analysis" where they aSsume that the input and 
output sequences are ergodic, such that this special 
7ee@Ppli tz strvueture results. This "ergodic assumption" can 
be described mathematically as follows (Ref. 14, pp. 44]. 
ConSider a finite length discrete-time sequence of 
measurements of some signal denoted by {s(n)}. If this is a 
representative sample of an ergodic process, then the 
following condition will hold. The value obtained from the 
expression; 


_ 
i+N 
1 ps s(n)s(n-j) 
+1] T= 1 
is Pmvariant with respect to the integer i. If this special 


Soemerttr2on Nolds, or isS assumed, then the Toeplitz structure 
of E@. {3.33} will result because of the relationships; 





i+N i+N i1+N+1 

mls on Seo atn=j) =_1 >. s(n+1)s(n+1-j) = 1 y s(n)s(n-j) 
N+1 nei N+1 n=l N+1 n=1+1 
i+N i+N+2e 

=i >. s(n-2)s(n+2=j) =_1_ > s(n)s(n-j) 
N+1 n=l N+1 n=i+2 


ne 








a OO eee 


Case 4: 


Covariance method 





Here n 5 sS+me 1+22=3, n, =T=10, and the data matrix becomes: 
Une, Ge) wt 1) 
u( 4) u(3) u(2) 
mes) ~ =«69t) we) 
y : eo UG) uC 4) {3.34} 
uc7) ome 6) w65) 
UC is) aS) (7/9 a i Gio 
Ue, wil B) ul7) 
ero y ug) us) 
The solution of the normal equations {3.26} is now given by; 
0 | 10 l 130, : 
wy » u(n) all viGe eee or ili u(n)u(n-2) 
8 n=3 18 n=3 \ 8 ns 
a gn cpr a is ee as E 
1 10 ie 
= iii u(n-1) ipa u(n-1)u(n-2) 
18 n=3 8 n= 
Se SS eS a aa ee ee fae rie ees ili lls Sew Gee ee 
y 10 
i 2 
SYMMETRIC 1. u(n=2) 
Ts Sis as! 
10 
1 >), uca)y(n) 
es 
10 
u(n=-1)y(n) 
n= 3 
10 
u(n=2) y(n) SVs ioy, 
ae 3 





Note that the square matrix in Eq. {3.35} is symmetric but 


not Toeplitz, and the summation limits are all the same. 


The main reason for the preceding four-case development 


men lO MOlint Cut the Specific condition under which the least 
squares Matrix becomes Toeplitz. Titi SepreOperty tS exploited 
in Levinson's algorithm (Ref. 1], which solves the normal 


equations with order of complexity proportional to the size 
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Soom least squares matrix R, rather than proportional to 
Dae cube, Ol the Size of this matrix as occurs in the other 
eamee cases Shown. Details of this algorithm are discussed 
later in Chapter V and Appendix B. For models other than 
Simple moving average, other researchers [Ref. 3 = 9 and 21 
- 25] constrained their model forms and used the 
Autocorrelation method to form least squares matrices with 
Toeplitz principle submatrices, and therefore gain some 
computational advantage when solving these equations using 
variations of Levinson's Algorithm. This chapter proves 
that this technique is cumbersome, unnecessary, and more 
importantly, generally produces inferior models compared to 
those obtained by the Covariance method. 

The constrained Autocorrelation method models are 
inferior in two main ways: (1) Only specifically related 
sets of model terms necessary for the special Toeplitz 
structure are allowed in the model. This limits model 
growth flexibility, increases the computational burden, and 
degrades the model performance. Further discussion on these 
powmes 15 given in Chapter V. (2) The particular choice of 
data interval described by the Autocorrelation least squares 
method (or its statistical equivalent) typically produces a 
model With significantly higher fitting error, and 
Ssiestamclally larger coefficient error than the Covariance 
method. This key point is discussed in more detail in the 


next two sections. 


aa 





BE. & PHREOREM. DESCRIBING THE CONDITION FOR SUPERIOR 

PEnRPORmMANWCE OF THE COVARIANCE METHOD 

The four previous methods use exactly the same form of 
computation; they differ only in the specific data 
measurements used. The Prewindowed, Postwindowed, and 
muvOocOrrelation methods supply missing zeros, either before 
OT i cver the meaSured data, or both. Thus these methods are 
arithmetically equivalent to the Covariance method operating 
oa a "aLseontinvgous function, and it is well known that it is 
hard for least squares or any other minimization method to 
Mend lie discontinuous funetions. An alternate explanation is 
Ene “the first three methods utilize constraints on the data 
weeues, and it is generally found that a constrained 
solution is inferior to the optimum (minimum valued) 
solution. This suggests that the Prewindowed, Postwindowed, 
ad AWitocorrelation methods would be inferior to the 
Covariance method. 

Simple computer simulation experiments given in the next 
Section confirm this reasoning. It remains, therefore, to 
mathematically express this feeling and these results that 
Supplying messitte data by a run of zeros iS a poor method to 
use. We are, of course, concerned with the quality of the 
fit, the sum of the squares of the residuals. 

The first step is to examine under what circumstances 
the Prewindowed, Postwindowed, or Autocorrelation methods 
would produce a lower average error than the Covariance 


method. Some mathematical notation is needed. 


18 





momeaiager 4a finite set of dynamic input observations 
mony) >Scen<=T} and corresponding: output observations 
eye 2 S<=n<=1} of some system, and a linear-in-the- 
coefficients model equation relating the present value of 
y(n) to functions of past values of y(n) and present and 
past values of u(n). Denote the integer m as the maximum 
discrete lag (order) of term of the model equation, and 9 as 
the coefficient vector. The model equation can be written: 


aa puCn=1),y(n-j);220,1,2,...,m;jz=1,2,...,mj] +e(n) (3.36} 


2 
al 


represent the average squared error obtained when a least 


Let fe, (n)} remgresent tne error residual and J 


Samieres minimization is performed over the interval (n,,n4). 


n 
‘kane = a (eeu 
1 XN. n=n t 
| aa: 
where n, = S+m oes) Sy) 
and n, = 1 13339) 
and Ny = n, - Nn, + 1 {3.40} 


Let the length of the error minimization interval be 


increased by ‘a small amount N,>0 to a larger region (nj on ) 


2 4 


where n,<n, and/or Gea. This new region includes the 
first region and available data points on either or both 
Sides of the first region. Missing data points in the new 


data matrix X are padded with zero values. Using the same 


model form, perform a least squares minimization over the 


interval (n,n ). Denote the new error residual as fe, (n)} 
and the average least squares error as I. 


Dl 











Jy = 1 e,(n) 
Nh, n=n, 
ea n n 
2= 
= 1 om e,(n)*+ We e,(n)*+ y 2in)?| {3.41} 
N +N, nen, n=n, n=n, +1 
where N. z (an, -ny+1) - Ny Pee Ee 
Since I is the least squares fit over (n,.n4), 1 Cenusie 
Gee hess than or equal to the quantity; 
n 
3 
mp enn) ade 
A. n=n 1 
1 2 


Z : ; ; 
Let —& be the nonnegative value representing this loss of fit. 


n 

gE? = 1 x [¢ Canoe 2 (ny | {32494} 

N. n=n - - 
I 2 

THEOREM 1: 


A necessary and sufficient condition for 


2 2 
J, < wg {3.45} 


is.that the following condition must be met: 


n=! oF 2 ie 2 a — 
- >» e,(n)” + y e,(n) < >. e,(n) =a (ee ee 


= nen : N ele |a: N 
2 Ll 3+1 il 2 Z £3.46} 
PROOF: 
Smoot cutiag {3.37}. {3.41}, and {3.44} into {3.45} yields: 
™, — 1 13 n n 
2 2 4 2 3 Z Z 
: 1 > e, as mo — - 2 e,(n) \s alts » BG) - E 
1% n=n) nen n=n,+l Ny n=n, 
eS 
Multiply by sai and transpose the middle term from the left. 
naz = n N. +N n 
i 2 P 2 Le 3 - 2 
> e,(n) + >. e5in? | - J >. e5(n) ~ [Ny +N, E 
way wen, +1 Ny n=n, 
sete 


Die 





Dividing both sides by N yields our desired condition. 


n -] n a) n N 
v_| in| e,(n)* + 7 e,(n) \¢ 1 is e,(n) -| + J E 
N men n-ne | N. nen 
2 3 l 2 fe 
Ls a |: 


ct 


Ie 
ho 


<— 


ies  hogical tosask how the condition of Eq. {3.46} 
Soulag*@rise. The condition states that the average fit over 
the added end regions must be less than the average fit over 
Pee Brde@le region minus the last term on the right side. 
Since N,N, | 


would be significant, and the error of the end regions must 


erie Jaet term on the right side of Eq. {3.46} 


be much smaller than the average error over (n ny) POnweq. 


c 
{3.46} to hold. Two obvious special cases can arise that 
Beeve®y Cite condition of Eq. {3.46}. 
C1] In the case where the forced zero-valued data 
points in the expanded region correspond to the actual 
input sequence and the natural system dynamics (and no 
noise), then it follows that 16 in {3.44} will be zero. 
As an example, consider a stable system excited by the 
following waveform {u(n)}. 
uien 9 


n, n, ae eh 


The output {y(n)} might have a similar shape. 





a5 





Pewee “preceding figure, the expanded data region 
contains actual zero-valued input-output values, e,(n) 
will be zero in the expanded regions, and I <a i. 

fe) If the first region (ny +n4) eoOmtatinicmdatal values. that 

don't exactly fit the model equation, and the additional 

measurements in the larger region (n,.n,) happened to 

Comeain Gata that exactly, or almost exactly fit the 

model equation, then the average error over the larger 

region could be lower. 
Both of these special cases are possible, but it appears 
highly unlikely that either will occur in practice. The 
special requirements on the data sequences for these cases 
ome wt x@m@ples of pathological situations. The probability of 
their occurrence is so small as not to be meaningful. 

The value of Theorem 1 resides not in the elegance of a 
mathematical proof, but paaatiee tts DPrOot sus sse Simple and 
its meaning so important. Theorem 1 basically proves that 
any least squares error minimization method other than the 
Covariance method, will produce a higher average fitting 
emror in ail but unlikely pathological cases. Therefore, 
any systems characterization or parameter estimation 
technique based on a least squares minimization method 
different than the Covariance method (e.g. Prewindowed, 
Postwindowed, Autocorrelation, etc) will generally produce 


Geiogiptmemal fitting error results. This result is important 


for the work that follows, since it is well known that 
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approximations made early in certain recursive algorithms 
Gicem Grow and lead to significant errors later on, and we 
want to use a recursive algorithm to efficiently evaluate 
the model growth. 

The next section provides some computer simulated 
experimental verification of the results of this section. 
Other factors affecting the accuracy of systems 


characterization and parameter estimation are also examined. 


GC. SiMVUELATION EXPERIMENTS 
Peper imemt 1: 

DESCRIPTION: An investigation of the effects of various 
least squares error methods, and the length of the observed 
data (S<=-n<=TJ], on the accuracy of the characterization of 
Known typical linear and nonlinear systems. 
CRITERION: Square root of the average sum squared fitting 
error, J. Note that we minimize ie but examine J. This is 
@one for clarity of graphical presentation. 

For the first part of the experiment, we synthesize the 
MA(3) system; 

¥(m) = 1.0u(n) + .8u(n=-1) + .6u(n—-2) — .3u(n-3) {3.49} 


ime tol lowane Test Procedure is used repeatedly. 


Teot PROCEDURE: 
Generate a random sequence for {u(n)}, uniformly 
distributed between the amplitude limits ([-5,5], and start 


Onwe Mieput thromeh the system (with zero initial conditions) 


512) 





me Giaerete time n=1. Record the observations 
fee is <en<=T} @nd {y(n);S<=n<=T} for S and T* specified 
below, and use them to minimize the least square equation 
error of {3.12}. Examine the use of the Prewindowed (P), 
Postwindowed (W), Autocorrelation (A), and Covariance (C) 
methods. The value of S is chosen to be 11, and T varies 
from 50 to 1000 in steps of 50. The experiment is carried 
ome over am ctisemble of ten (10) runs, with different, but 
equivalently distributed, random input sequences. For the 
data obtained from the ensemble of ten runs, plot the 
maximum, Minimum, and average value of J, as a function of 
the value of T and of the choice of minimization method. 

For the second part of the experiment, synthesize the 
following ARMA(2,2) system and repeat the test procedure. 
y(n) = 1.0u(n) + .8u(ng1) + .6u(n-2) - .9y(n-1) - .7y(n-2) {3.50} 

For the third part of the experiment, synthesize the 
following BVM system and repeat the test procedure. 

y(n) = 1.0u(n) + .8uC(n—-1) + .6u(n-2) -.9y(n-1) - .7y(n-2) 
+ .2u(n)u(n) + .15u(n-1)u(n-4) = .3y(n-2)y(n—-4) 
= ,.6 wenm—1)y(n=1) + .05uCn—-2) y(n-4) Sais 10: 

Figures 4 through 7 present the maximum, minimum, and 
average values of J versus T and the choice of the least 
squares error minimization method for the MA(3) model of Eq. 
{3.49}. As expected, the A, P, and W methods show improved 
performances with increasing T, but even at T=1000, these 


methods are significantly inferior to the C method. 
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J, THE SQUARE ROOT OF THE FITTING ERROR CRITERION 
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Figure 4: PREWINDOWED ANALYSIS OF A MA(3) SYSTEM 
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J, THE SQUARE ROOT OF THE FITTING ERROR CRITERION 
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Figure 5: POSTWINDOWED ANALYSIS OF A MA(3) SYSTEM 
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J, THE SQUARE ROOT OF THE FITTING ERROR CRITERION 
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Figure 6: AUTOCORRELATION ANALYSIS OF A MA(3) SYSTEM’ 
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J, THE SQUARE ROOT OF THE FITTING ERROR CRITERION 
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Figure 7: COVARIANCE ANALYSIS OF A MA(3) SYSTEM 
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mois conjectured that the slight increase in the 
average value of J as T increases with the Covariance 
method, 1s due to the finite precision of the computer used 
for these experiments. The experiments could be repeated 
using double precision variables in an attempt to verify 
this conjecture. We are actually approximating the N 
equations X98 = y for the 4 unknowns 9. Since there are many 
more measurement equations than constraint equations, it is 
natural that the average error should be slightly higher as 
T (and therefore N) gets larger. 

Figure 8 shows how the choice of the four error 
minimization methods affect the matrices and vectors 
involved in the evaluation of the MA(3) model, for different 
values of T. Note that the R matrix and r vector have been 
normalized by dividing each element by the firsterow, first- 
column entry of R. This does not affect the answer and it 
provides for an easier comparison of the twelve cases shown. 

Since the Covariance method uses only the exact data 
meaSurements, we denote as exact the values of R and r in 
the Covariance method of Figure 8. The corresponding matrix 
and vector in the other three methods can therefore be 
considered to have errors. The important thing to recognize 
here is that errors in the third decimal place in the values 
©®f the R Matrix in these other methods, translate to more 
Sigmificant errors in the inverse of R, and ultimately into 


substantial errors in the esStimates of J and Q. 
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Figure 8: CONTENTS OF MATRICES AND VECTORS UNDER VARYING 


CONDITIONS, FOR A MA(3) MODEL AND SYSTEM 
vector r have been normalized by the first: row, first column entry of R.) 
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While we are interested in determining the least squares 
method that provides the minimum J, we have also evaluated 
the typical offset in the coefficient estimates that result 
from the use of these four methods, and present this 
information in Figures 9 through 12. The erratic behavior 
of the A method in eStimating the coefficient values appears 
to be the result of the padded zeros on both ends of the 
data matrix X. The appearance of Similar curves in Figures 
9 through 12 for the A and W cases can be explained as 
follows. Both the Autocorrelation (A) and Postwindowed (W) 
cases have padded zeros at the bottom end of the matrix X 
meee by “Eq. {3.30} “ad {3.32} respectively. The effect of 
the padded zeros in the Autocorrelation case X matrix 
decreases as N gets large, and the A and W cases approach 


each other in this ] iri t . 
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Figure 9: COEFFICIENT ESTIMATION OF A MA(3) SYSTEM 
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Figure 10: COEFFICIENT ESTIMATION OF A MA(3) SYSTEM 
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Figure 12: COEFFICIENT ESTIMATION OF A MA(3) SYSTEM 
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mmeres & through 12 show the significant differences 
mesuiting from the choice of error minimization method, 

This choice directly affects the contents of R and r, which 
in turn affects 0 and therefore J. A logical conjecture is 
that the condition number (ratio of largest to smallest 
eigenvalue) of the matrix R, could be a good indicator of 
the quality of the least squares fit. In other words, the 
more well conditioned the matrix (lower condition number), 
tae lower the corresponding fitting error J. While 
Semerecically pleasing, this conjecture is not born out by 
experience. Greover 30 cases of linear and nonlinear 
systems modeled using each of these four error minimization 
methods, the corresponding R matrices were all well 
conditioned (low condition number), there was no significant 
difference in condition number between the four methods, and 
there was no direct correlation between lowest condition 
numbers and lowest fitting error J. Condition number data 
Beene lu@ed in the typical results of Figure 8. 

(ies 1S explained by the following. The fit J is a 
Mermecion Of the entire coefficient vector 9 and the vector 
feomace Ehe coefficient vector 9 18 a function of both the 
meerix HR (actually the inverse of R) and the vector r, the 
Gondition nutfber of Ris an insufficient measure of the 
accuracy of 9, and therefore is an insufficient measure of 


mae duality of the fit J. 
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Fagures) 13 throweh 16 present the results of the 
@m@periment for the ARMA(2,2) model of Eq. {3.50}. Figures 
17 through 20 present the results of the experiment for the 
BVM(2,4) model of Eq. {3.51}. Both of these sets of figures 
indicate the superior performance of the Covariance (C) 


Meknod in winimizing the fitting error criterion J. 
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J, THE SQUARE ROOT OF THE FITTING ERROR CRITERION 
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Figure 13: PREWINDOWED ANALYSIS OF AN ARMA(2,2) SYSTEM 
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J, THE SQUARE ROOT OF THE FITTING ERROR CRITERION 
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Figure 14: POSTWINDOWED ANALYSIS OF AN ARMA(2,2) SYSTEM 
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J, THE SQUARE ROOT OF THE FITTING ERROR CRITERION 
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Figure 15: AUTOCORRELATION ANALYSIS OF AN ARMA(2,2) SYSTEM 
73 





a 
a 





J. THE SQUARE ROOT OF THE FITTING ERROR CRITERION 
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Figure 16: COVARIANCE ANALYSIS OF AN ARMA(2,2) SYSTEM 
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J, THE SQUARE ROOT OF THE FITTING ERROR CRITERION 
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Figure 17: PREWINDOWED ANALYSIS OF A BVM(2,4) SYSTEM 
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Figure 18: POSTWINDOWED ANALYSIS OF A BVM(2,4) SYSTEM 
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J. THE SQUARE ROOT OF THE FITTING ERROR CRITERION 
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Figure 19: AUTOCORRELATION ANALYSIS OF A BVM(2,4) SYSTEM 
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J, THE SQUARE ROOT OF THE FITTING ERROR CRITERION 
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Figure 20: COVARIANCE ANALYSIS OF A BVM(2,4) SYSTEM 
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This experiment has lead to a greater nites Poicand i ive Ot 
tame accuracy of the different we cation Pechnadues with 
respect to the size of the observation sequences for a 
nonrecursive model, a recursive model, and a nonlinear model 
of the BVM form. For the models examined, the Covariance 
Meeet SauUares €rror Minimization method is Superior to the 
Prewindowed, Postwindowed, and Autocorrelation methods. 

These results are representative of those obtained with 
other system equations. The conclusion to be drawn from 
Simulation Experiment 1 is that the Covariance method is the 
momwc accurete of the four methods. Since our primary 
problem 1S accurately characterizing systems whose exact 
mathematical form is unknown, the Covariance method is 
adimmnec FOr the rest of the work in this thesis. This 
a eae ineroducing offset errors in J and 9 that might give 
misleading results later in our model growth techniques. 

The next section examines another factor that can affect 
the estimates of ye and 9; output measurement noise. Ais 
P3erneludeda for convenience at the present time, and is 


Pomerred to again later on. 
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D. EFFECTS OF OUTPUT MEASUREMENT NOISE 

If the system output is contaminated with additive noise 
{yv(n)}, then the evaluation of the model and the estimates 
of the model coefficients may be a ARE. If the additive 
noise is uncorrelated with the system input, and the model 
is nonrecursive (e.g. no 99:q OF Sp;q terms in a BYM which 
thereby reduces to a MA or VOL model), then the effect of 
the additive noise on the coefficient estimates will 
generally be small. This effect approaches zero in the 
limit as the size of the data segment (T-S+1) gets large. 
This property of a nonrecursive model is well known in the 
literature [Ref. 13 and Ref. 14, pp 41 and pp 144]. 

To better understand the effect of uncorrelated additive 
O@tput moise im the case of a linear recursive a ae 
denote the noisy output sequence as {z(n)}; 

am) t e¢(n) + +m) for all S<=n<=T {3.52} 
[Oo Wiiseeze tHe meQuat ion error Minimization techniques 
discussed at the beginning of this chapter, we substitute 
2(n) for y(n) in the evaluation equations, and Eq. {3.5} 


becomes; 


T 


z(n) = 9 X(n) + e(n) ose | 


4 Other measurement noises such as additive input noise 
or multiplicative input and/or output noise are also 
poesitole, but are not considered. 


> The following analysis holds for linear recursive or 


nonrecursive models (e.g. ARMA). There appears no tractable 
way to extend it to recursive nonlinear models (e.g. BVM). 
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mere under the condition of additive output noise, 9 is the 
coefficient vector, and x(n) is the corresponding term 
vector based on u(n) and z(n), instead of u(n) and y(n). 


The """ is used to denote factors affected by the noise. 














x(n) = x(n) 
fy(n)} = {2(n) ] 
= x(n) sal) 
= = u(n = 6 
(ht tytn) } y(n)} = {v(n)] 
Se) x(n) {3.54} 
Bote that if the model is nonrecursive, x,(n) = 0, 
and x(n) = eum). A More inter@sting example is as 


follows; If ee th cudn)Gutn=1))7(n-1) |, then 
Oe i ee et ) 2(m-2)) fe u(n),u(n-1),y(n-1)+v(n-1) | 
inal Fa) [20.6 4(H-) | 

Substituting Eq. {3.53} and Eq. {3.54} into Eq. {3.9}, 
and minimizing with respect to 8 bya calculus, yields 


the least squares solution equations; 


eC Pwiae gta {3.55} 
N N 
where zi Lz(n,),2(n,+1),..-,2(n,) | {3.56} 
and 

Bie ph. ; ; 

x = [ Z(n,) k(n,+1) «..  &(n3) ] {3.57} 
Substituting 2q. {3.54} and {3.57} into {3.55} yields; 

fer > eel O=-r + ry Boos) 
where the following matrices and vectors are defined; 

ee ie [3.59] 
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RA KS Ky ace, 
N 
x= [ x(n,) x(n,+1) ... x(n3) 1 [3.61] 
% 
Smee See) ayn + 1)... xy(n,) ! aoe 
———- a Ky ne, 
N 
Se ae hee, {3.64} 
N 
age 1)... v(n,)] [3.65] 
Solving {3.58} fom tame model coefficient vector 8; 
a sal: 
oe ieee eee] Te + ry | as 


mae Tirst tem on the right side of Eq. {3.66} can be 


Simplified when the inverse exists. 


-] nis -] 
eee) oo [| reer rR, |] 
| id = s 
=< | +R al ie {3.67} 
substituting {3.67} into {3.66} and using @=R ‘r from 


Eq. {3.19} yields; 


- — a 
2 = WT ek S| OR : bod eee 
Shi a a a 
i ae Reet | 8 ty + [ IT +R el R = 
F = _ = 3 
= [ I +R RY] 9 + [ T + R RO] R ey {3.68} 


Note that when the measurement noise {y(n);S<=n<T} is equal 


wOomeze ro. use reauces to the null matrix, ne reduces to the 


Vv 
null vector, and Eq. {3.68} Vagewds 6 = ©. We are interested 


in the noisy measurement case, and denote the distortion in 


the model coefficients as 9 a: where 


|D> 


9 = 


94 - 9 BSS) | 


wees tititing Eq. {3.69} into Brel « {3.68} yields an expression 


moe wero OlToetvoerticon in the model coefficients. 
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-l -l -] -l 
g. [ I + R Rig 9 eee feed R Poo = 8 

Zs -l _ =a! 
Meret 1 O+ Ror) = 6 {3.70} 


=V 


Eq. {3.70} gives an exact expression for the coefficient 
Muotortion due to additive output noise, but its meaning is 
hard to appreciate directly because of the four inversions. 
The first term on the right side can be expanded in a 
geometric series; 


as =o) a ll +2 = +3 
i ct ae “ite J =e) R 1 els R OR = [—--R 1. | i ee es rer ale 


Vv V Vv 
This series is valid when the absolute values of the 
eigenvalues of matrix [ RR] Hee al! bess otianiel.9 Matrix 
powers greater than one are negligible when the eigenvalues 
are small compared to one. These conditions are met when 
ne total power @©f the additive noise is small in comparison 
Womene tetal power of the system output; i.e. high SNR. 


Usime this ass#mption, Eq. {3.71} is approximated by the 


first two terms of the expansion, and Eq. {3.70} becomes; 


; za “i 
9G = [(aeaq Ryd Pe Fer Tie) - 9 
-l =a) -j -l 
2 g- RRO + Ror, - RRR, - @ 
-l =i 
= R [CC & = BR om = Rea {3.72} 


The above equation can be interpreted as describing the 
model coefficient distortion vector as composed of the 
difference of two vectors. One 1S a constant term, and the 
other is a multiplicative function of the noise-free model 
eoeflTicientis. Nowe that the only inversion needed for this 
approximation is that of the matrix R. Also, both vectors 


amee Wa mwectly pireportional to the inverse of the matrix R, 


83 





wawen ls yadependent of the particular additive noise 
characteristics. 
Time distortion on the estimates of the model coefficients 


is therefore the difference of two vector; 
=H 


= 1E 
Me - & te, - my Pony = tl - RRR ry i272) 
and 
-l 
Sm = R RVG (Sea 
where 94 = @.- 83 {3258 


This shows how the coefficient distortion of a linear 
recursive model depends upon the choice of the particular 


model terms, time averages of the system input and output, 


and time averages of the additive output noise. 
As an illustrative example of the effect of noise on a 


linear recursive model, consider the ARMA(1,1) model; 


¥Gn) = 92969? u(n) + e.g! u(n-1) + 0 Ciery Ca=1-) {3520} 


ae 


The following matrices and vectors are written by inspection; 


x(n) = [u(n), u(n-1), y(n-1)] (3.77) 
x(n) On. 0), We n=) ) (oes 
u(n,) u(n,-1) y(n,-1) 
X = u(ny+1) u(n,) y(n,) 
u(n, u(n,-1) y(n,-1) { Seg} 
0 v(n,-1) 
0 v(ny=1) 3.80} 
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u(n,) u(n+t) — u(n,) y(n,) 


S> 


= u(n,-1) u(n,) eter u(n,-1) ytage) 
yn, =) y(n.) es y{n,-1) 
; y(n.) 
1 7 u(n)y(n) 
N n=n, 
zx ee os a a 
= { > u(n-1)y(n) 
n=n, 
i ce 
LS? y(n-t) y(n) (5.81] 
N n=N» 
0 0 0 v(n,) 
aS 0 0 Rie: 0 yin,+t) 
yay, - 1) v(n,) yee, 3 
’ v(n,) 
-;\ i —_ © 
-3 
ith aL v(n)v(n-1) 
Mien. {3.82} 
u(n, ) u(n,+1) a u(n,) u(n,) u(n,-1) page) 
= u(n,-1) u(n,) a rT a ie oe? ey 
nay) y(n, ) a vine) 7 | : 
u(n,) u(n,-1) y(n,-1) 
n n 
eG | { i onan 1) ) : iu a u(n)y(n-1) 
L ay , I n=n» N n=n, 
 #, = 
r 4 by u(n-1 ) Cees y u(n-1)y(n-1) 
| N on=n N n=n 
| 2 | 2 
aa eS ce ee Se 
| 3 
t i2 } y(a-1) 
SYMMETRIC n=n, [3.83] 





O 0 nae. 0 ) 0 v(n,-1) 
R. = O 0 O 0 0 v(n, ) 


v(n,-1) v(n,) or Carma 


0 ] 0 | O 
ee = ee 
= O | om ii O 
<a F-- , oY 
0 | 0 pi v(n-1 ) 
& | |N d=n, (3. 84} 
l 
1 = aoe : ®130 \°? 
7 Pe ee 
—— CO] = 
O on , FP? (n-1)? 9 Gi) 
ly 4 Oak 
1 immmtae | 
2 
O 
= O 
n 
3 
a) se alone 
, N n=n, {3.85} 


Representing matrix R of Eq. {3.83} in the following shorthand; 


f {3.86} 





jme inverse of wmetrix R canbe written as shown below. 





Z ) n | (af - e*)/IR| (ce re) teeth mente” 
ote TRY) - eer sav Ter Pa fe eT 
—a— KOR ee 4—-—-—- - ee GS ie 
k : q : (be - cd)/|R] | (ee - ae)/|R| . (ad - b rar 
(3aeG 
where IR = adf + 2bec - cfd - bf - eta {3.88} 
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erreeeuemnie 15.65; and {3.87} into {3.74} yields: 


3 2 
ou = 9651 es v(n=-1) q 
wo n=n 
Ss {3.89} 


me seamucmne {3762}, {3.84}, and {3.85} into {3.73} yields: 


k 
nN nN 2 Sa 
= i Tn. v(n)v(n-1) 1-s >> ¥(n—=1) q 
N n=n, N n=n, a pene 
S {3.90} 


Swostmrurmars {5.09} and {3:90} into {3.75} provides an 


Snore sston for tme distortion of the coefficient vector; 


1 i V Ga vy Ga— 1a 1 as Ds vin=1) -9.. Carnal yi v(n—1) 
Woon W onen O51 T fee 
2 2 





weere kK, aq, ands are elements of the inverse of matrix R. 
We can now directly examine the effect of additive noise 
Sneeme Woesitoiemt distortion vector, in t®rms of the time 
averages of the additive noise. If the additive output 
Geese, 1S Grsodiag and uncorrelated with itself, the first 
emm Oulmuee Grgne side of Eq. {3.91} is small and will 
approach zero in the limit as N=—eoo. The magnitude of the 
Cm@etiierence Gistorrion values will be directly proportional 
PO, OOtCh Une Sample autocorrelation of the output noise and 
the value of the reeursive coefficient. Under this 
condition the coefficient distortion vector equals the 
NMemative of Eq. {3.89}. The value of Qq4 indicated by Eq. 


{3.91} has been observed in simulation experiments. 
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The preceding example serves to demonstrate the new 
insight that is available as a result of the development of 
Pamationa {3.52} through {3.75}. The effect of the presence 
ene properties of the additive noise on the distortion of 
tne model coefficients follows directly from an examination 
of these equations. The impact of this will be addressed 
aeein im CGhaptrers VI and VII. 

From the preceding development of an expression for the 
distortion in the model coefficients resulting from additive 
output noise, an expression for the related increase in the 
patting efror caw also be obtained. Substitute z(n) for 
y(n) in the dewelopment of Eq. {3.20}, denote the minimum 
Grror fitting criterion resulting from the noisy data as i. 


and make use of {3.52} through {3.69}. 


=" n ee 
a> 2 y > z(n)” = r79 2 
N n=n, 
T iE 
ee Merete eet vaelo-~ [| r+r, J] [ 09 +6, ] 
- - x * = -~ 7s =~ =4 
N 
-tyty+2v¥y+ivv-r'e-r'es-rJe- rye, (3.92) 
N N N 
; ee 2. 2 
Tamove wee di StOrtZOn Wn the minimum value of J as Ja: 
i 2, J- {3.93} 
Sueur ueiee {3.92} into {3.93} and using {3.20} yields; 
(Peet vy - ro, - ro -r,'o {3.94} 
ies = -  - - oS 


Using the assumption that the additive noise {v(n)} is 
ergodic and independent of the system output f{fy(n)}, the 


Piroe term Ou the right side of {3.94} is small, and 
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approaches zero as N=@oco. Using this common assumption and 
Sumerremeing {3.72} into {3.94} yields; 


Iles a 
ys = viv -rR {€ I - RR Par, - R*9} = 


d 


“6 


T=] 


=i 
age it ft - rk ke ir, - R90} {3.95} 


From Eq. {3.19} we have a! = ripot. Substituting this into 


Eq. {3.95} and simplifying gives; 
2 ae T 2 Ta | 
Ja 3 = vay oe if oo “ye , - By oo Bad - ry R ev 
T -1l -l T.=1 
t sBies R RR Geek Sy R RO 
7 I T Te-l ai -l 
= ~2 Beee Roe = fr. RF + 20 Jl eee ee }-s OStsie 


Bqiemion {3596} is a new expression for the distortion 
jm Ghie TittCing error criterion of a linear recursive model 
caused by additive output noise. Note that if the model had 
Deen Nenreeursive, vector ila, would reduce to the null 
yveetor » and matrix RY would -edene GO the null maiaeix. In 
bikes special ease, Eq. {3.96} reduces to; 

i = viv =o *? wep. 

N nen, {329 
ome CMe distortion in the fitting error criterion would be 
equal to the average power of the additive output noise, as 
expected. 

The ARMA model of £q. {3.76} is used again in an 
iliuetravive example of the effect of noise on the fitting 
waneot Oo aelrmear recursive model. Substituting Eq. {3.76} 


through {3.88} into {3.96} produces; 
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! 
+ = waweeaqeewaewewae ah @= = es = 
1 
1 4 ° 

ae aegeeawe @eawee wea se age = a 
oe 
ls 1 > v(n)v(n-1) 
| N nen , 
2 
n 2 n 
3 2 3 
= il y Vv Gae +fo5..01J lo >. v(ne-1) 
N n=n, N n=n, 


: . i 2 
1 v(n)v(n-1) See 2) Oe (1) 1 - s viCn=— 1s) 
sd, ; ; wee al N bn, {3.98} 


ie vie neise ivOm)! is ergodic and uncorrelated with 
Peeett, the factor premrltiplying the third term on the 
right side of Eq. {3.98} is small, and approaches zero in 
the limit as N-—eco. USiune enls common assumption, the 


Micculm, werror difetortiom reduces to the following; 


n 2 n 
Jq = x ‘a woos + [95.160 a a ene 
n=n» N n=n» 
2 n 
= 1 + [95.461 E: a v(n)? for large N 
: N n=n, {3.99} 





mre preceding equation shows that for uncorrelated additive 
output noise, the increase in the fitting error criterion is 
proportional to both the power of the additive noise, and 
one plus the square of the magnitude of the recursive 
coefficient. If the model form had more than one recursive 
merm, the resulting equation for i would appear more 
complex, but would follow a related form as this example. 

The preceding development provides new insight to the 
actual effect of additive output noise on the 


characterization of linear recursive systems. 


91 





IV. EVALUATION OF MODEL EQUATIONS 


a RESTING FECHNIQUES 

Given a set of input and output measurements and a model 
equation that is a function of these measurements and linear 
em a Set of coefficients, Chapter III showed how to obtain 
estimates for the coefficient values and the error residual. 
The literature reports (Ref. 17] that the ARMA model can 
reasonably represent most linear systems of interest using 
oweers less than m = 10. A current problem is the 
efficiency of computation when larger and more general model 
forms like VOL and BVM are considered. Regardless of the 
degree or memory of the model, the calculation of the model 
fit involves solving the normal equations {3.15}. 

Th@s se@eGion diseusses the traditional direct least 
squares model evaluation technique. The next section 
develops a unified solution technique for the more efficient 
recursive evaluation of a wide class of models. The last 
section compares the computational features of these 
evaluation techniques. 

The traditional modeling technique starts by selecting a 
first gedel y(n) = O x(n). We include the index parameter 1 
mo lG@@mbify Chas fimst model, and write the prediction form 
equation as follows. 


win;,i)” 6= (1) > x(n,1) + e(n,1) Coe cal} 


Je 





Using {4.1} in place of {3.5}, and following the same 
least squares development as Chapter III, yields the normal 
equations corresponding to Eq. {3.14}. The index parameter 
is included where needed. 


BA iE 
afxcnxcp}ecdy =a xcney [4.2] 
N N 


This leads to the model error evaluation and coefficient 


estimation equations in terms of this indexed notation; 


-1 
371) = 1 yy - ma cp sate ll.) a8 
N 

-1 

0( 1) = WFC 1) “rie {4.4} 
where 

ae 

Wei) wee F C1) X01) {4.5} 
N 
i 
meer oréi) = 4a X(T) ¥ 1 4761} 
N 


If the fitting error hee is too large for the 
Seplweation, the traditional systems identification 
technique is to select a larger model that contains the 
terms of the first model plus some additional gees. This 
second prediction form model iS written as shown below. 

yaru~2:) = 9(2) > x(n,2) meen, 2 ) {4.7} 
The technique forms equations like {4.3} and {4.4} for the 
medel of {4.7}, amd continues until the fit mee of model 
number i is within some acceptable limits. This is a brute- 
romGe. an@s= imeifieient approach Since the evaluation of the 
second (and subsequent models) does not take advantage of 


wae solution calculated for the previous model(s). 
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To appreciate the above point, we digress momentarily to 
demonstrate the computational complexity (as measured by the 
number of multiplications or divisions) involved with 
calculating the inverse of the matrix R(i) when the model 
rorm@ is the BYM(d,@) introduced in Chapter II. Equation 
{2.9} gives the number of coefficients in a BVM as a 
funetion of the choice of the degree and the memory. Table 
1 shows the number of coefficients, and therefore the size 
of the corresponding R(i) matrix, for any BVM of degree up 
fo 6 and’ memory up to 10. Ines Chapter, thecnotation 
e(i) i6 weéd for the size of the eye? model regardless of 


its form. 


m= 
ye 
m= 
m= 
m= 
R= 
a= 
a= 
Ans 
m= 


0 
1 
4 
2 
y 
5 
6 
‘i 
8 
9 
1 


0 


S| 





TABLE i: Number of coefficients ina BVM of degree d and memory mom 
The cOmiputational cost of inverting a matrix R(i) of 

Smre ct1), ts of te order’ of 1/3 time's the cube of c(i) 

multiplicative operations. Table 2 shows the approximate 

niwmoer Of Such operations required by the direct least 

SCuware® tectiniaqwie for the inversion of the matrix R(i) 


corresponding to a BVM of degree d and memory Mm. 


g4 








d=1 d=2 d=3 d=4 a5 d=6 


re ———— a — ss a ees ee ee 


m=0 1 3 9 ee 42 T2 
m= 1 9 243 22a 13100 55450 190600 
m=2 4o 2667 55460 651000 5271000 3.266E7 
m= 3 115 w.F29E4 5.617E5 Ve 1a 7 ToS 0b G 1.681E9 
m=4 243 5.249E4 S250 1H 6 eye Sars! 2.671E9 SB. 17010 
m=5 4u4yy ioe k5 1.594E7 8.459E8 Catton Om 6.347511 
m=6 fem 3-/50E5 5.8e2E7 4.488E9 2.096E11 OF O57 Ble 
m=7 li~s §§.200E5 1.804E8 1.940E10 1ee242ZEte 5.326613 
m=8 1638 1.638E6 4.925E8 7.143E10 6.087E12 3.429E14 
m=9 2287 3.043E6 Weta BS 2.31451 1 eC PoOOr 13 i Ae 5) AN ES 
m= 10 3007 5.33886 2.760E9 6.746E11 9.487E13 8.646E15 


TABLE 2: Number of multiplication operations required for the 
matrix inversion involved in the direct least squares 
evaluation of a BVM of degree d and memory om. 

It is clear that for degrees above 3, the inversion of 
the matrix R(i) required for this direct least squares 
evaluation of model i, rapidly becomes prohibitively 
expensive Tor in@reasing door m. For problems of interest, 
however, we want to evaluate such higher degree and/or 
memory models of the BVM form. 

It should be mentioned that for large c(i), the 
computation of the elements of the R(i) matrix requires 
approximately c(i)N operations. This can dominate the 
Gompuication time if W >> c(i), as is typically the case in 
whe li tVerature. Even though the correct model forms were 
used in the three examples of experiment number 1, the 
Autocorrelation, Prewindowed, and Postwindowed methods still 
peqguere@ a large ®Wto obtain a small fitting error. On the 
otner hand, the Covariance method gave superior performance 
Mithows requiring WN >> c(i). These results are typical of 


those obtained with other computer simulated experiments, 


ohe. 





and indicate that an equivalent performing model solution 
can be obtained more economically with the Covariance 


method. 


B. PRESENTATION OF A RECURSIVE EVALUATION TECHNIQUE 
Toedeweclop efficient a@lgorithms for evaluating models of 
merase ML allow ous to relate the equations and solutions of 
Various @Models. This notational change is important for 
subsequent developments. We reorder x(n,i) and Q9(1), 
respectively, in amanner described below. We denote the 
reordered x(n,i) as w(n,i), and the reordered O(i) as p(i), 


svich thet Bq. {4.7} becomes; 


ye, 2) = p(2) w(n,2) + e(n,2) {4.8} 
where a(e,2) = ie ae) ] {4.9} 
and ° wont) ee 6x(n,1) {4.10} 
and w(n,2/1) is a vector formed by starting with x(n,2), 


deleting all of the terms that also exist in w(n,1), 
and reducing the size of the resultant vector by 


eliminating the spaces of any deleted elements. 


J0 T 
Mc Witeee w(2) = [ a(i/2) | pl2/1)’ | et 
d 
and p(1/2) = onv(1) evaluated at the a” "4 teration ek a 
and p(2/1) is a vector formed by starting with 9(2), 


deleting all of the terms that also exist in p(1), 
Gm@amredmcineg the size of the resultant vector by 


eliminating the spaces of any deleted elements. 


Sie 





It remains to show that the evaluation of model equation 
{4.8} can be accomplished more efficiently than the 
evaluation of the same model given instead by Eq. Agee i 
Before demonstrating this result, the preceding notation is 
generalized for models beyond the first and second. 

We recursively define an equation for model i of size 
e(i) in terms of model i-1 of size c(i-1), where 
ma) > clam). 

y(m,i) = pli) w(n,i) + e(n,4) 4.13] 
where w(n,i) and p(i) are size c(i) vectors defined in the 
same manner as Kq. {4.8} through a ee such that, 

w(n,i)- ey , oe # Ghee ] {4.14} 
and 
Bis [epeici/i)) | pla/i-1) 1 {4.15} 
Following the standard least squares development yields 


the normal equations corresponding to equation {4.13}: 


1fwoa)' wa) ] (a) pani wi) y [4.16] 
N N 


where we have the c(i) x N transposed data matrix; 


wi) = [w(ny,t) x(n, 


eee went, 2) ] Ta 
The solution of tea ataG. | is the coefficient estimation equation; 
aie) as P mCi) R¢i) Tere 3 {4.18} 
mae SOluteHonm FOr the model fitting error criterion is; 


mei) = 1 yly - mee) DGG) aa) 14.19} 
N 


where the c(i) x c(i) least squares matrix is; 


Be San) ai) (4.20} 


N 


of 





Sea the c(i) x1 vector d(i) is defined; 


a(i) = 14 wli)ty 4.21} 


Bneicad of Solving Eq. {4.18} and ea on we use {4.14} 
and {4.15} to develop a set of recursive model evaluation 
and model coefficient estimation equations’. Dehua mat te 
represent the number of terms in the (i)th model that are 


not contained in the (sit) lene model. 


mem = c(a) - cli-1) eee } 


6 The recursive model evaluation equations presented in 
this chapter were independently developed; but they turn out 
to have many sgimilar features with a published algorithm by 
Hsia [Ref. 14, pp 2a |: Hsia's algorithm is designed for 
recursively computing the least squares coefficient 
estimates of a model as the number of terms in the model 
increases. The equations developed in this chapter have 
e@uivalent recursive capabilities for coefficient 
estimation, and also include some features that are used 
later in the thesis for the more general model growth 
problen. 

The form of our equations turn out to be somewhat 
different since we have some matrices and vectors that do 
not appear explicitly in Hsia's algorithm. These matrices 
and vectors are basic to the development of the main model 
mroewth results of Chapter VI. We also include a recursive 
Seqwation for the model fitting error based on a simple 
Siwenmsiom of owr particular form of the coefficient 
evalWation equations, without explicitly solving for the 
model coefficient estimates. Using our equations, we are 
able to show how it reduces to a special but widely used 
efficient recursive parameter estimation technique (Durbin's 
[Ref. 2s modification of Levinson's Algorithm) when two 
restrictive (from a performance modeling growth perspective) 
assumptions are made. Since we need our form of the 
equations and these other features in the remainder of this 
thesis, the equations are developed at this point. 
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Say 2.6.1. tante a 4 | into {4. 7}, and define the Pon lowin Ps 






it T 
m(n,,i-1) wim ,i/ 1-1) 


W(i) 


1 

i 

" 

Bk 

gtn5*1,2-1) | w(n,+1,4/i-1)” 

; 
T 
priaa= |. ) 


w(ng,i-1) w(n 


3 


= LC ete) : w(i/i-1) ] ames) 
° t 
where W(i) is the N x c(i) data matrix for the (i) model, 
st 
W(i-1) is the Wx c(i-1) data matrix for the (i-1) model, 
and W(i/i-1) is the N x q(i) data matrix for the new terms 
og OR , st 
im “the (i) model that are not in the (i-1) model. 


Substituting Bq. {4.23} into Eq. {4.16} and simplifying; 






A(i-1) een i-1) h(i-1) 


es en == em eww mw wem Fo, @ -  }]) ell A RS EA EA EA CT ED 

T 
Be/Gek) | | MGH/ i=" ) Gy ia) | ao 
where the following definitions are made for convenience. 


Ria = 1) se (9-1) (4-1) eee te) xe re(i-| ) matrix i425] 
N 


B(a/i-1) = 1 w(4-1) W(i/i-1) eae) ere Gr malt ra x {4.26} 
N 


T 
ee at) ame a / i) wCi/i-1) , a gli) x q(i) matrix eee 
il 
i 1 ) eS 1 wii) z , a cli-1) column vector {4.28} 
N 
mea / iat o> nGH/4) y » a afi) column vector {4.29} 


N 
The set of linear equations {4.24} Powapswecial jeumuted 
form of the normal equations for the Go: model. This 
special form results from the ordering or w(n,i) described 


by Ba. ia .14}. and leads to an efficient set of recursive 


oe 





Solutiom equations for J 


Sy 


Weide oCn weil also provides the 


basis for our unified approach to the model determination 


and growth problem examined in the next chapters. 


a: 8) [EG EE CC See 


The set of simultaneous linear equations {4.24} has a 


2 
compact solution for p(i) and J (i), based on the previously 


2 
obtained p(i-1) and J°(i-1). Appendix A contains this 


development and we only state and use the results here. 


mets Conven@ens to use the following definitions; 


F(i) = TY 6c Gee , a e(i-1) x q(i) matrix {4.30} 
ey wily ia1) — e@iya-1) A(4=1)* BC4/2-1) 
= aA(i/i-1) + B(i/i-1) F(i) , a ali) x q(i) matrix {4.31} 
ao) tis im1) - B(i/i-1) p(i-1) 
wim a/i-t) + P(i) n(i-1) » a q(i) column vector AO S2 | 
k(i) = Gi) e(i) , a q(i) column vector {4.33} 
Wei long as | a(i/i-1) {| #0 es ay 
then the solution of {4.24} is given by 
mG.) “> Lae ‘ seed fate ) (ae 
a) I 


Meme © Wee the null vector and I is the identity matrix. 
The resulting minimum average sum squared error value 18; 

ee G1 - 1) CR) (4.36) 
ine tom, sche following recursive.relationships exist; 


- oe ( - 
~1 CTS) ) Pee CLD Aes 7 F(i)G(i) 1 
Le = | — a >= 
oem, Rta) 


n(i)- era) | ey en ] {4.38} 
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C. CAPABILITIES OF THE RECURSIVE EVALUATION TECHNIQUE 

We now demonstrate some of the advantages of using 
equation {4.35} as an alternative solution to {4.18} for 
model i. We showed that evaluation of equation {4.18} for a 
BYVM(d,m) requires the inversion of a matrix of size c(d,m) 
given by equation meio}. Examination of {4.30} through 
{4.38} reveals that only one smaller inversion of size qe) 
need be performed to evaluate Eq. {4.35} and Eq. {4.36}. 
This is the inversion of G(i) required for the calculation 
Bem(i) in FR. {4.53}. 

Tom 4 SFE, te size of matriz Gli) at the (i)th iteraion 
ie @2¥ven by the following equation. 

size of G(i) = q(i) = c(d;,m,) - e(d.., ,ms_, ) {4.39} 


where d;; m 


i dil? and Mj.) are the BVM degree and memory 


i 

at iterations i and i-1, respectively. The computational 
cost of recursively evaluating models using kq. 14.35} and 
Eq - {4.36} is a function of the degree and memory of the 
meetows modeis 1,2,...,i-1,i. 

Table 3 represents an example where we consider three 
different paths from the BVM with d=1 and m=1, to the BVM 
with d=4 and m=4. The paths are denoted with arrows and 
Sewersized letters. The order of.complexity involved in this 


example is shown in Table 4. A direct evaluation of the BVM 


with d=4 and m*4 is given for comparison purposes. 
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TABLE 3: 







Flow of four growth paths through the chart of the 
number of coefficients in a BVM of degree d and memory mn. 


Size of Matrix 
to be inverted 


Gea) 






Inversion 


Operations 


(a(i)]°/3 





Pathii 
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Sin 
2 
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2 

3 

4 
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6 
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2 

3 
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5 
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De 
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ae, 3 3 
27a.) 20 17 
isan) | ite 99 
(4,4) 14 By 5, 
Ctals) 3 5 
wile oe 5 2 
Ct 5) 7 2 
(1 ade) Q 2 
(2, #) 54 45 
(3 4) 219 165 
(4, @) FAG 495 
it at) 3 3 
Ca) 9 6 
oe) 19 ee 
aa ) 34 15 
(m2) 125 91 
Cae ®, 329 204 
a4 ) 74 385 
(4,4) ap alw Tid 
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334 
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aJOOE 
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~O54E+7 


-196E+7 
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Paths A, B, and C each result in a lower total 
computational complexity than the direct evaluation of the 
BVM(4,4) model described by path D. Other paths are 
possidle bdSBUt this example is representative of the 
computational savings that result from the use of the 
recursive algorithm. 

The development of the recursive algorithm is based on 
three assumptions. 

(1) All model equations are linear in their respective 

eoetticikent vectors. 

(2) The equation of the Ca model includes all of 

the terms contained in the Gai model, plus some new 

terms. This is Taine mathematically in equations 

{4.9}, {4.10}, ana {4.14}. 

(3) The determinant of A(i/i-1) is not zero. 

We explicitly avoided any assumptions on the 


st. 
relationship between w(n,i-1), the terms in the (i-1) 


th 
model, and w(n,i/i-1), the new terms appearing in the (i) 


model. This results ina general recursive solution 
aigorithm that is applicable for any type of model growth we 


care to Sonsader ’. The following chapter shows that the 


7 We ee lease the limitation on the form of each tern 
omet We defimed in iq. {3.6} for continuity of presentation, 
but mention here that other functional forms besides integer 
products of observations could be used as long as the 
resulting model equation is still linear in the unknown 
coec¢fichents. 
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existing "recursive-in-order" algorithms (e.g. Levinson's 
[Ref. 2 = 4&4 and 20 —- 25) and Lattice [Ref. 5 = 8 and 39] 
are special cases of the recursive evaluation equations 
presented here. 

Chapter VY develops several new techniques for specifying 
poss2ble model term vectors w(n,i/i-1); for recursive growth 
using the BVM. These are less restrictive than existing 
techniques, and allow for more accurate and compact modeling 


of typical systems of interest. 
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V. TECHNIQUES FOR GROWING MODELS 


A. OVERVIEW OF MODEL GROWTH 

Tire “Objyeetive of our analysis is to determine patterns 
or other key behavior properties from the measured data, and 
wse this information to efficiently formulate a suitable 
mathematical model. This model relationship is evaluated 
Segarnse betn its ability to predict behavior of the output 
time series within Some reasonable and statistically 
quantifyable degree of accuracy, and its compactness of form. 

Earlier work in model development was generally limited 
to an assumed linear relationship, and started with 
techniques like harmonic analysis and mathematical transform 
ereory (Ref. 32, 33 and 34]. In the late 1960's, time 
Ber bes Statistical analysis methods were developed by Box 
and Jenkins [Ref. 17]. These methods are related to 
transformations of the spectral methods, and approach the 
characterization problem from the different perspective of 
prediction form models [Ref. 35]. These techniques are not 
closed form solutions to the system characterization 
problem, but are instead multistage approaches that have 
been widely used for the time series analysis of real world 
systems (Ref. 36]. 

The Box and Jenkins technique asSumes a general class of 


time series models which has been found, experimentally, to 
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pesenetremely rich. The procedure oe es as a trial-and- 
Baemom procegs with decision points where the analyst is 
required to select the next step based on the available 
information [{Ref. 37]. 

sniee@ 6Gxasting linear time series techniques are not 
ebosed form solution algorithms permitting full analysis 
without human intervention and decisions, it would not be 
Surprising that we are unable to find a complete closed form 
algorithm for the more general nonlinear relationship case. 
But monlinear systems characterization is interesting and 
maportcant, and a solution is still worth pursuing. 

Chapter IV introduced a general, recurSive set of 
SGuations for efficiently evaluating related sets of model 
equations. Efficiently handling the system characterization 
problem requires a method for determining what ee Se 
permis) tO B@dd at @ach iteration; how to "grow" the model. 

This chapter discusSes existing techniques for recursive 
model growth (e.g. "recursive-in-order") which have been 
applied to some linear and nonlinear systems. Six variants 
on this type of "block=-form" technique are developed for the 


more general BVM form, and the capabilities and limitations 


of all these techniques are investigated. 


Be EXTSTOaWG TECHNIQUES FOR MODEL GROWTH 
Time syatems identification literature [Ref. 2 - 8, 20 = 
ao, so ame sd) Contains two different techniques for both 


specifying and recursively estimating model coefficients, 
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maa MO @xplicit teehniques for just the recursive evaluation 
of model fitting error. Both techniques are based on the 
concept of considering new model terms that have a unique 
"Increasing order" relationship to existing model terms, and 
take advantage of the special Toeplitz matrix structure that 
can be made to occur in the resulting equations for the 
coefficient estimates [Ref. 2 - 4 and 20 = 25]. This 
moeeplitz structure is limited to a restrictive class of 
models, and requires the use of the Autocorrelation error 
Minimization method. The iterative solution technique is 
weed On Durbin's Simpliciation of Levinson's algorithm 
[Ref. 2], and is well known in the literature. This 
technique is a special restrictive case of the general 
recursive algorithm presented in Chapter IV, and the details 
of the relationship are given in Appendix B. 

Theorem 1 and Experiment 1 show that the use of the 
Autocorrelation method typically produces a suboptimal fit 
when the data sequences are finite. In terms of nonlinear 
models, the requirement for Toeplitz (or even Block- 
Toeplitz) structure for the least squares matrix severely 
limits the choice of allowable models. The "regular-form" 
kernel Nonlinear ARMA model used by Perry (Ref. 8] is a 
EYypDical example of a restricted choice of terms in the 


model. This is discussed again in the next section. 


LOT. 





twe second recursive coefficient estimation and growth 
veennique is based on "Lattice-filtering" [Ref. 5 - 8, 38 
and 39]. This is a prediction-error version of Levinson's 
algorithm using a lattice structure implementation rather 
than the more conventional tapped delay line type of 
implementation. The error residual signal after each stage 
of the lattice has converged, is used in the computation of 
the lattice parameter estimates and signals used in 
subsequent stages. The lattice technique is a special 
implementation of the general algorithm of Chapter IV which 
has limited applicability. This technique offers no 
advantages for the unrestricted model growth we wish to 


eonsider. 


C. “RECURSIVE MODEL GROWTH WITH THE BVM 

Chapter II introduced the BVM and showed that it 
subsumes the MA, AR, ARMA, Volterra, and Bilinear model 
forms. The recursive model evaluation and coefficient 
estimation algorithm presented in Chapter IV can be used for 
efficient and meaningful model growth of any of these model 
forms. 

This section presents six extensions of the recursive- 
in-order techniques which apply directly to the general BVM 
form. In all cases, the first model is evaluated by a 
direct least squares fit using Eq {4.1} through Eq. {4.6}. 


Subsequent models are evaluated using the recursive 
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Perea s oresented in Eq {4.8} through {4.37}. We 
meatricted the upper limit of the lag on the model terms to 
be €gqual to the memory m for both the input and output 
mens. for clarity of presentation. This restriction is 
removed in the model growth technique presented in the next 
chapter. 

[re first growth technique starts with the base model, 
BVM(d,m) = BVM(1,1) and uses the following fixed procedure. 
fore Hitting error J(d.,m) = J(1,1) is evaluated for this 
first model, and for subsequent models BVM(1,1+i) where 
Sour er, ... Weta JC 7, i) stops decreasing significantly 
(the meaning of which will be discussed later). This last 
significant model is denoted as the new base model BVM(1,m). 
The fitting error J(1+j,m) is evaluated for subsequent 
momels am i+j},m) where j=1,2,3,-... until J(1+j,m) stops 
decreasing significantly. This last model is denoted as the 
new base model BVM(d,m) and the iteration starts on the 
memory m again. This two-phase iteration is continued as 
Vonm@ as«imeaningful reduction in fitting error is obtained. 
This search strategy is denoted as the "M Directed Growth" 
because the initial phase involves growth in memory m (Fig. 
aN.a). 

A second growth technique involves a similar algorithm 
with the difference that the first phase iteration is on the 
degree d (Fig. 21.b). This technique is therefore denoted 


ao @D Direeted Growth". The choice of an appropriate 
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Significance test for switching between the two phases of 
these directed growth algorithms remains an open question. 

A third growth method is denoted as "Diagonal Growth", 
and again starts with the evaluation of the base model 
BVM(d,m) = BVM(1,1). The fitting error of successive models 
PM ink i) for i=1,2,3,... is evaluated until J(1+i,1+i) 


ia aicigepe ably stall (Fig. 21.c). 


i! Zz 3 © al 2 3 4 12 es 4 
d d 
1 1 1 
2 2 2 
3 3 3 
4 4 4 
m 7 m 
Figure 21.a Figure 21.b Figure 21.c 


BIGURE 21: Three Growth Methods for the BVM 


A fourth technique is denoted as "M-D Zig-Zag Growth” 
and starts with the evaluation of the base model BVM(1,1). 
The memory m and degree d are alternately incremented until 
the fitting error of the resulting model is acceptable (Fig. 
Dome). 

A fifth technique denoted as "D-M Zig-Zag Growth" 
imvolves a similar algorithm with the key difference that 
the first vhase iteration is on the degree d (Fig. 22.b). 

A sixth growth strategy is denoted as “Neighbor Growth’. 


It starts with the base model BVM(1,1) and evaluates its 





fitting error J(1,1). Next we evaluate models that differ 
from the base model by one increment of degree, one 
increment of memory, or both. in this starting case we 
evaluate J(1,2), J(2,1), and J(2,2). We denote the model 
with the lowest fitting error J as the new base model and 
continue this iteration process until the decrease in J is 


no long@r significant (Piz. 22.c). 


ies @ 4 pe oe ges ton 8 a 
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FIGURE 22: Three Additional Growth Methods for the BVM 


These six techniques all fall under the general type of 
growth we refer to as "“block-form" techniques. At each 
iteration the model growth is accomplished by automatically 
including one of a predetermined set of model terms. 

One other model growth technique has been proposed by 
Perry [Ref. Ba). A special and restricted form of the 
nonlinear ARMA model (earlier version of the BVM) and the 


Autocorrelation error minimization method are used to 


fevwe'op a apecial multichannel lattice form parameter 





estimation solution. Reference 8 does not contain any 
experimental verification of this technique. We analyze 
this technique to show its intrinsic weaknesses. 

A quadratic nonlinear ARMA model of this special form 
can be written as shown below (a translation of Ref 5, pp. 
193, Eq {4.29} into the notation of this thesis). 


M M M 

il 2 =k 
y(n) » 9129 (hyu(n-hy) - y ». 94.9 (hy ho)u(n-h,)u(n-h,-h,) 
h,=0 h,=0 h,=0 


M M M 
1 3 1 
: 370 120 


ik 
7 >. 94,,(h,.n,)u(a-h,)y(n-1-h,-h,) + e(n) ieee, 


THE proposed Yeehnique of Reference 8 requires that the 


user prespecify the integer values of the upper summation 


limits Mos Mas and M)- Tren Eq. {5.1} becomes a recursive- 
in-order My model equation that can partially evaluated by 
least squares lattice techniques. Some of the model terms 


Somnowerto Wire GFestrictions of the lattice solution and must 
be evaluated separately by direct least squares. Despite 
feo atwraceiveness of the potentially efficient lattice 
form, the requirement to prespecify all but one of the upper 
Summation limits of Eq {5.1} excessively complicates any 
Systematic model growth with this method. Reference 8 does 
not provide any suggestions on how to select these upper 


ieee emcee this technique automatically involves a fixed 





mee Of Cerms at each iteration, it falls under our 
Gerimnitron as a block-~form technique. It basically is an 
attempt at fitting a problem (parameter estimation) to a 
parcaeculer form of solution (lattice form recursive-in- 
Ordgem).,@and is inferior to the growth methods of this 
chapter and the following. 

These block-form techniques can all eventually subsume 
any system of the BVM form. They unfortunately require a 
Significant amount of computations as the degree and memory 
increase (See section VI.D). The first five techniques are 
all nonlinear extension of the linear recursive-in-order 
concept discussed earlier, and are brute-force methods. The 
Neighbor Growth technique offers an appealing approach for 
potentially more efficient model growth, but also suffers 
from the problem of high computational cost when used to r 
evaluate the addition of a large number of model terms. We 


continue the discussion of efficient model growth along a 


related line in the next chapter. 
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VI- SEARCH TNDICATORS FOR EFFICIENT MODEL GROWTH 


S. DSSCUSSION 

The block=form growth techniques of Chapter V follow 
from the conventional concept of model growth in linear 
Systems. Closer examination reveals that these techniques 
have a number of serious flaws when used for nonlinear model 
grown . Given a finite amount of measurement data, we can 
only consider evaluating models when the number of model 
terms is less than or equal to the number of data samples. 
Table 1 reveals that many of these nonlinear models can only 
be evaluated with extremely long sequences of data 
measurements. For example, the model BVM(3,5) has 363 
different terms. Only a few of these terms may be needed in 
modeling any third degree dynamic system whose equation 
involves the factor u(n=-5) or y(n-5). However, all of these 
363 terms are involved in the full model evaluation when the 
block form growth techniques are used, and sufficient data 
m@aSurements must be available. The following points 
Summarize the results of many computer simulated model 
growth experiments (See Chapter VII later). 

As the degree or memory increase, all block-form 
modeling techniques automatically consider an increasing 
number of terms at each subsequent growth iteration. Tags 


Pemmirts in Papidly increasing computational cost, and often 





vroduces an ill-conditioned least squares matrix A(i) due to 
the inclusion of several model terms with nearly equivalent 
peeperures im terms of vaiwes in this matrix. By ill- 
conditioned, we mean the condition number (the ratio of the 
largest to smallest eigenvalue) is numerically large Cenae 
greater than 10000). 

A higher condition number for matrix Afi) is related to 
less accurate estimates for the coefficients p(i), and 
higher fitting error aD) . In some cases the matrix A(i) 
becomes so ill-conditioned that it is no longer positive 
definite, and the resulting model evaluation is no longer 
optimum in any least squares sense. It has been 
experimentally verified that the general use of these block- 
form modeling techniques often produces poor results for 
noniinear systems; namely offset model coefficient 
estimates, high fitting error, and the inclusion of terms 
that are not actually needed. An example is provided in 
Ciepter Vil. 

One possible approach to overcoming the preceding 
probiems is to start with some base model such as BVM(1,1), 
and use one of the block-form techniques just to specify a 
new set of q(i) model terms. Instead of evaiuating these 
candidate terms as a set, assume that only a subset of them 
1s actually needed. Pwempronplen= is how to find the 
Particular subset of these terms required in the final 


model. 


ily ie) 





hits approach is related to standard stepwise regression 
analysis [Ref. 40], and also to a recently published 
technique known as GMDH, the Group Method of Data Handling 
(Ref. 41ijJ. These preceding techniques are general enough to 
permit consideration of a wide variety of model terms and 
Demm Can avord the ill-conditioned solution, but they still 
have the following major problem. With q(i) potential new 
model terms, there are ee possible model equations to 
consider. Except for small values of q(i), the exhaustive 
evaluation of each of the corresponding solution equations 
rapidly becomes prohibitively expensive. 

iene Pe ete ae@dqitional problem of a stopping criterion. 
Examination of Eq. {4.33} and Eq. {4.36} shows that the 
fittuwme orror roe) is monotonically decreasing when new 
model terms are added and matrix a is positive 
definite. Therefore, while only some number r out of these 
particular q(i) candidate model terms may be needed in the 
fim model, the fitting error with r+1 terms will still be 
lower (with the exception of numerical errors or an exact 
model fit). 

head) ear aeercal Situations, we have only finite 
measurement data and finite computer resources, and are left 
WitUn an interesting and not unusual problen. When the 
Oreceding model growth techniques vied ill-conditioned 
Seumitmomeamor waerease to a point (1) greater than the finite 


data can handle, or (2) beyond the computational resources; 
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are there any prudent procedures that we can employ? The 
field of artificial intelligence has provided some 
motivation in related problems including chess playing 
programs and voice-recognition methods. The basis for 
comparison is the intractability of the exhaustive solution 
when there is only finite data, time, and computational 
power. Since the performance modeling problem is 
inverestings, and has practtieal applications, we develop a 
semi-heuristic technique to follow when there are not enough 
pesourec®s’ Por the exhaustive solution. 

This chapter presents the new concept of “search 
Emdicators” for efficiently growing a model of an unknown 
linear or nonlinear system from a finite set of input-output 
measurements. Rather than attempting to solve the typically 
large set of normal equations for all of the new candidate 
model terms, the proposed concept uses an easily computable 
Search indteator (scalar value), or a set of such search 
indicators, for each candidate model term contained in 
ie) = eectmeacnegrowthn iteration i. The relative values 
of these search indicators are used to systematically 


8 : wae 
exclude those terms expected to have insignificant effect 





8 The word expected is used to acknowledge the heuristic 
Mature of seme of these search indicators and of their use. 
We have been unable to prove that any technique based on 
Giver wse 1s girmearanteed to pick the optimum model terms at 
eaeh growen ltvermation. These indicators are logical factors 
based on the recursive model evaluation equations, and the 
results of many experiments show that techniques using some 
search indicators provide for highly efficient model growth. 
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on reducing the fitting error. The remaining term or terms 

aye retaiwed in the vector w(n,i/i-1) for this Go model, 

and used to form a much smaller set of normal equations that 

are efficiently evaluated by the general recursive equations 

presented in Chapter IV. 

This proposed two-phase concept offers a number of 

improved capabilities over the existing growth techniques. 
(1) Since we compute the search indicators for each 
candidate model term seperately, we can consider 
more potential model terms than the number of data 
measurements. Nie a result, the terms of nonlinear 
models with large degree and memory can now be 
considered. We must, of course, eliminate enough 
terms such that the reduced model form evaluated in 
the second phase has fewer unknowns than the number 
of data measurements. 
(2) This technique allows the evaluation of widely 
different model terms at any iteration. Unlike the 
recursive-in-order (or more general block-form) 
techniques, there is no longer the implicit 
Peterietion that the current "model contain all of 
the possible set of input, output, and bivariate 
terms specified by a particular degree and memory. 
(3) The initial set of model terms w(n,1) can be 
Detceer chosen by the search indicator concept, 


rather than by blindly picking a predefined base 
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model like BVM(1,1). The computer Simulated 
Sx peiwements Of Chapter VII show that this property 
allows for the efficient characterization of a 
gemeral class of systems having an input-output 
delay L (e.g. where terms containing the factor 
men=<) for K=0,1,2,...,L—-1 are not needed in the 
final model). In cases where the system under 
consideration has a delay factor L, the block-form 
techniques fail to recognize and exploit this 
property, and often converge on amore complex 
model. 
(4) The search indicator technique selects one or 
more metas model terms in the first phase, 
produces a much smaller matrix A(i/i-1), and 
tmeremore significantly reduces the computational 
burden. It 1s also capable of efficiently handling 
the previously discussed problem of ill-conditioning 
caused by nearly equivalent model terms. 
Timese femmeures are demonstrated in the following sections. 
The next section defines various possible search 
indicators based on the signals, vectors, and matrices 
contained in the recursive model growth solution and 
evaluation equations introduced in Chapter IV. some 
Bageical MNterpretation is given for each of these search 
indicators, and the set is reduced to a smaller set worthy 


Sieur enrer investization. Results of many computer 
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Simulated experiments have shown the superior growth 
capabilities of the proposed concept, and confirmed that the 
S@aren indicator technique provides significant 
computational savings and accuracy improvements over all of 
the previously discussed growth techniques. Examples of 
model growth are provided in the computer simulated and real 


world experiments of Chapter VII. 


Se DEVELOPMENT OF SEARCH INDICATORS 

Thie’ no tat ion wj(n,i/i-1) is used to represent the Cpe 
Model term (out of the candidate set of terms) that we 
Gonsider adding at the a iteration, given that we have 
previously evaluated and accepted a model at the fai: 
Merptactrouw, We let q(i) still represent the number of 
candidate model terms considered at the aa iteration, and 
therefore Wee 3, ce GK). 

The following development is partially based on the 
yowmcnLon for tire Signals, vectors, and matrices contained 
Aeewin vmesrecursive solution and evaluation equations of 
Chieipeer IV. One important note of clarification needs to be 
made al Cites BOlnt tO Minimize potential confusion. 

Mie sec Of equations {4.8} through {4.38} in Chapter IV 
was developed to evaluate the improvement in model fitting 
Cmneonwanad Camroulbave™ the new coefficient estimates based on 
adding the entire candidate set of terms w(n,i/i-1) to the 


model with the existing set of terms w(n,i-1). The set of 


Search indicators developed in this chapter is to be 


ae 





calculated for each of the candidate model terms w5(n,i/i-1) 
in the candidate set w(n,i/i-1). These indicators are 
designed to each give some partial metric or measure for the 
improvement in model fitting error. As a result of this 
development, many of the matrices and vectors defined in Eq. 
mo) Chrough Ba. {4.38} for the evaluation of multiple 
Model terms a@re used in this chapter in a reduced form (e.g. 
vectors and scalars, respectively) for the search indicator 
evaluation of each term. Whenever possible we use the lower 
case vector version of the matrix designation to represent 
the corresponding reduced form vector (e.g. wj(i/i-1) for 
WCi/i-1)). Likewise we use the scalar representation to 
describe the corresponding reduced form of a vector (e.g. 
n;(i/i-1) meme Ci /i-1)). 

‘Pirese reductions are made only for clarity in the 
development of "ene seareh indiweators. Once a subset of 
model terms is selected by the sear¢ch indicators, the full 
form equations of Chapter IV are used to evaluate the 
fitting error and coefficient estimates. PGels noted: 
however, that some of the factors calculated in the 
evaluation of the search indicators for the candidate model 
terms can be used again in the actual evaluation of the 
model performance. Thus, some of these computations will 
serve double duty. 

™— pelea yecomoern is efficiency of computation, so the 


numerical complexity (number of multiplications and 
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Gey rorons) involved in calculating each search indicator has 
been analyzed. The notational convention O(N) will be used 
Bo Gemote Eimultiplication or division operations. This 
complexity notation is ineluded with the development of each 
search indicator, and summarized with examples in Table 5 
ema@ Pable 6 after the development of all of the indicators. 
Démote the size c(i-1) of the (i-1)°" medel as PP, anid 
the number of data points in the error minimization as N. 
Since we have completed the the evaluation of the Gusa es 


model, the following matrices and vectors are available. 


WCi=e1) © a NW x P matrix given by Eq. {4.23} 


ACi-1) Qaeep x P Wetrix given by Eq. {4.25} 


a P x 1 column vector given by Eq. {4.28} 


Hci=1) 
, silee. a 
We also have A(i-1) amid pCi=1), the coefficient vector. 
Some preliminary vectors needed for the development of 


& 
tre searen indtcators are presented at this point. 


oie Ht oe a: ee 
wiki /i-1) Cwz(n,,i/i-1),wj(n,+1,i/i-1),....wy(n de eal Os 


es) 

= a#H<x 1 transposed vector of the signal specified 
by the a candidate model term over the 
inwerval (nj .n,). This is the reduced version of 
Pilewdava matrix wWCt/i-1) given by Eq. {4.23}, in 


th 
the case of just the (j) candidate model term. 


e(n,i-1) 


y(n) - w(n,i-1) p(i-1) tor n,<en<=n, Ouse} 
= value of the error residual at discrete time n 


C 
Premethe (i-i) model iteration. This can be 


computed with P multiplications per point. 


i222 





Seen) 


[ e(n,,i-1), SiGe ew 1". s e(n,,i-1) ] {6.3} 


Jz 


= a N x 1 column vector of the residual sequence 


st 
Vawes “rom the (i-1) iGerabion over the 


interval (n,.n.). tmsoereduires PN multiplications. 
Z gh 
Deect/i=)1)  S 1 WHi-1) w-(i/iel) {6.4} 


—meues Fregueed yerston of matrix B(i/i-1) given by 
Eq. {4.26} in the case of just the ap candidate 
Memes tCcrm@: Since this is a P x N matrix times a 
Nox) 1 COljmn vector, the cost is O(PN+1). 


SMG: =1) 2b (i/i-1) (6.5) 


Sey) 
= the reduced version of matrix F(i) given by 

Eq. {4.30} in the case of just the nae candidate 
mo@el £erm. Since ee i is a P x P matrix that 
we already computed, and Bb Ci/i-1) Po a. Pax a 
column vector obtained in Eq. {6.4} at a cost of 
O(PN+1), the total cost of computing £j(17i-1) is 
0(P*) + OC PNY1) = OCP tis 1). 

Twelve different search indicators were developed and 
Gexamined in this work. The initial set of search indicators 
I(j,1) through I(j,8) was developed from an algebraic 
perspective; i.e. these relationships arose from an 
examination of the general recursive evaluation equations of 
Chapter IV, under the condition of adding a Single new model 
term w,(n,i/i-1). Each search indicator therefore has a 
dijmeee Felacwonanip to the actual system characterization 


experiment. 
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Biwe Mirst seamch indicator is the time average of the 
ial 
product of the signai specified by the (j) candidate model 


term, and the output signal of the systen. 


MA 
Neen T=)" x dr > w,(n,i/i-1 )y(n) {6.6} 
N N n=n. 


Since w<i/i-1) ees ievector audyy is a N x i vector, 


Ae) 


the caiculation of I(j,1) requires O(N+i1) operations for 
each candidate model tern. 

This indicator corresponds to the scalar version 
h,(i/i-1) of the vector h(i/i-1) defined by Eq. (ag29 
While intuitiveiy appealing as the “empirical” cross- 
coprelation between the output of the system under test and 
the signal specified by the candidate model cae 1 oe 
Indicator Nas a basic flaw. It is a function only of the 
output of the system and the candidate model term, and as 
such, does not depend on the particular terms in the 
previous model. Numerous computer simulated experiments 
have verified that I(j,i), taken alone, is unsuitable as a 
reliabie search indicator for model growth. 

The second search indicator is the value corresponding 
to the reduced version g (i) of the vector g(i) of 
Bq). 4 a in the case of just the Ge candidate model 
tern. 


(3,2) = 4 w,(i/i-t) y + £,(4/4-1) h(i-1) 


N 
=r Caai) £,(4/i-1) n(4-1) Go 7) 


where ® (i/i-1) is a N x 1 vector and y is a N x 1 vector. 
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Since BE (i7/i-1) ie &@ Pow 1 vector obtained at the cost 
o(p*+pN+1), and h(i-1) is a P x 1 vector, the calculation of 
I(j,2) requires a total of 0(P7+PN+P+N+2) operations for 
each candidate model tern. 
We digress momentarily to examine some of the 
characteristics of the full-vector g(i) given by Eq. {4.32}. 
SVpstitwting {4.26} and 14 . 219 | 11'%.0 {4.32} produces; 


ee = 1 GEN) & = 3) 6a/2-19 W02=1) p(2-1) 
N N 


U 
ie 


eet pe 1) pG-1) | 


a 


mF nh a1) 2 (4-1) 


if 
= 1 e(i-1) W(i/i-1) {6.8} 


Since {e(n,i-1)} is the prediction error sequence of the 
st 

( a4 ) model, and the vector e(i-1) contains the values of 
th 

fe(n,i-1)}, we see that g(i) is a vector whose (j) element 

; ot 0 
is the normalized inner product of e(i-1) amd the (j) 
coiumn of W(i/i-1). Examination of Eq. {4.23} and Eq. {6.1} 
th | 
reveals that the (j) column of W(i/i-1) is the vector 


WAa/i-1), and yields the following expression. 


g.(i) = 1 e(4-1) a (a/i-1) 


i) 
i 


& see) 


j 1 e(i-1) wy (i/i-1) 


N 


ay 
S a(x) $2? = ised 1 ) Ps ee, 


N 


{6.9} 


W223) 





where Cog . [ g, (i), Se re ef) pee rigiae, < 2) Nniot OF 
and where q(i) = c(i) - e(i-1) | Gemtale 
Examination of Eq. {6.9} shows that the value g;, (i) is 
the time average of the product of the error residual signal 
and a signal formed by products and powers of products, of 
the input-output measurements corresponding to the 
specification of the Cpe: candidate model term. This gives 
physical interpretation and increased meaning to the value 
g (i), which is contained in Eq. {6.7} (search indicator 
two), and equivalently in Eq. {6.12} below as search 
Maareakitor tivmmee. 


eo) = 


T ny 
1 Me GA / 4 = 1.) este) ae ye eo, 27 2-1 Jen, 2 1)) = 2. Ch) 
- = - = j 

N N n=n, : 


Gtees 

Since w;(i/i-1) is a N x 1 vector and e(i-1) is a N x 1 
vector obtained at the cost O(PN), the calculation of 
TC 53") requires O( PN+N+1 ) Sel nee ome for the first candidate 
model term. For the second and subsequent candidate model 
terms the cost is reduced to O(N+1) since e(i-1) has aiready 
been calculated. Tabies 5 and 6 show that I(j,3) can be 
computed much more efficiently than I(j,2). It is also 
intuitively appealing to be using the error residual from 
the previous model in evaluating the usefulness of a new 
candidate model tern. 

ie the °° model produces an exact match to the 
measured input-output data {u(n)} and fy(n)}, then g,(i) is 


zero. This occurs regardless of the choice of the new tern 
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th 
wm (nya /i-1) Gonsidered for inclusion in the (i) model. It 
follows that the absolute value of g;(k) should be a useful 
MeeeGure attamy step k < iin the growth iteration. Gas 
conjectured that this represents a measure of the relative 
benefit of that particular model term as compared with other 
poesitBie choicés of terms. In this regard the term that 
produces the iargest absolute value of g ; (ic) would also 
probably result in the smallest value of ator the error 
fipeaee Criteria. This last point remains to be 
demonstrated. 

The above discussion indicates that I(j,3) should be a 
pewemclalliy geod s@a@rch indicator, either alone, or in 
Sembination with other factors. We will later consider 
other search indicators based on g,(i). 

The fourth search indicator I(j,4) is the time average 
of the square of the signal specified by the (aia candidate 


model term. 


2 
4s n 3 
ee Bee (G7 i~ 1) wy a(i/i-1) = 1 yy fw; (n,i/i-1)] {6.13} 
= = = j - j 
N N n=n 
Z 
Siafce Bi. i/i-#) is a N x 1 vector, the calculation of I(j,4) 


requires O(N+1) operations for each candidate model term. 
This corresponds to the reduced version of matrix A(i/i-1) 
given by Eq. {4.25}. It can be efflecmently computed, but 
suffers the same flaws as I(j,1). 

Seem che search indicator is the scalar corresponding 
to the reduced version of matrix G(i) given by Eq. (Age ne aya 


th 
the case of one additional coefficient in the (i) model. 


het 





At 
Tee (i/i-1) £ ,(4/i-1) 


— T _— 
all wae 1) Wen aie mie b, 


N 


Ta 4) b i/i-1) £,C4/i-1) On pa) 
Since b (i/i-1) Usea P x 1 vector, and Ere) ie ae ae 
Vector, the cost of computing £,0i/1-1) is O(P-+PN+1) and 
includes the cost of computing b;Ci/i-b). Therefore the 
Cemeulatrom of 1I(],5) requires a total of O( P< +PN+P+N+2) 
operations for each candidate model term. Examination of 
Eq. {4.33 and {4.36} reveals that the scalar value I(j,5) is 
inversely related to the reduction in the fitting error that 
results if the Single candidate model term is brought into 
the model. As such, there is reason to expect that I(j,5) 
would be a good search indicator, either alone or in 
Comeinaewtion With other factors. Unfortunately, the high 
eest for I(j,.5) precludes its general uSe. 

he sixth search indicator is the scalar value 
[orneapowdme co the reduced version of vector k(i) from Eq. 


{8833} in thre caBe of one additional model term. 


| Ce eee 
— —)} = 
N 
_ ag — ae a rare. 
ee ey) Olde 1) eb -Ci/ian1) f.Ci/i-1) {6.151 
ame —J = J = J 


where Beri )) is @ Wx 1 vector, eC s1) iseaN «x 1 vector 
obtained at the cost O(PN), and f£,0i/7i-1) is a P x 1 vector 
obtained at the cost 0(P-+PN+1) which includes the cost of 


Cempubing the P x 1 vector b,(i/i-1). Therefore, the 
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eetculatiom of I(j3.6) requires O(P7+2PN+2N+P+4) operations 
for the first candidate model term. For the second and 
subsequent candidate model terms the cost is reduced to 

O( P7+PN+2N+P+4) since e(i-~1) has already been calculated. 

Ex@mination of Eq. {4.36} reveals that the value of 
meppooeisa directly related to the reduction in the fitting 
arrer ea) that results from including the candidate term 
in the model. Tables 5 and 6 indicate, however, that there 
is a very high computational cost associated with this 
eaaren madicator. 

The seventh search indicator is the value of the change 
me tme emror eriterion ree isa resudt. Of sineludi mame thie 
Gandidate model term. It is based on Eq. {4..32}, {4.33}, 
ard {68.36}. 


et) Meee 7103.5) 


T T 
1 owy(i/i-t) y + £,(i7i-1) nCi-1) 
N 





1 2 Gy ese cu eas ~ 3a Cl/io) eee eas Gare at 
7 =) = =3 a 


where p,ti7i-1 Pode on lovecvOremy is ya lox 1 yeetor, 
DG 1)ets oP x 1 vector, and £5(i/i-1) isa ex |) vector 
obtained at the cost O(P7+PN+1) Woteneine muides the cost of 
computing the P x 1 vector bjCi/i-1). Therefore the total 
Soemect compucrng I(j),7) requires O( P~+PN+2P+2N+5) 
operations for each candidate model term. 

fiimcmicmune exact value of the reduction in the error 


criterion resulting from including the candidate model term. 
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me saucer, it probably should not be cailed an “indicator”. 
Pees cammed SS a control indicator since it has the 
desired property of exactly describing the Santon 
improvement. Tabies 5 and 6 show that this indicator is 
extremely expensive to compute. We next reduce the 
computational complexity using I(j,3). 

The eighth search indicator is the value of the change 
in “ee Error criveérion re), as a result of including the 
candidate model term and using the error residual signal of 
the model from the previous growth iteration. It is based 
on Eq. (emet., {4.33}, and {4.36}. Mua o mena cde Ghee to lelo wi me 
form; 

a2 
nen) Pg, 5) /105,5) 


f a ket Fe | 
7 = 


2 


iy Pe/s-1) Baste) +b (4/i-1) £;(4/4-1) asl te 
N 


where Mie 1) as) aa NW x 1 vector, e(n-1) is a N x 1 vector 
obtained at the cost O(PN), and fee) is a P x 1 vector 
obtained at the cost o(p*+pn+1 ) which includes the cost of 
computing the P x 1 vector my Bo |r Therefore the 
caiculation of I(j,8) requires 0(p7+2PN+2N+P+5) operations 
for the first candidate model term. For the second and 
Subsequent candidate model terms the cost is reduced to 
0(p7+PN+2N+P+5) since e(i-1) has already been calculated. 
Pome 1S Cme efact value of the reduction in the error 


criterion resulting from including the candidate term, using 
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the alternate and less costly computation for g (1) 
Geseuosed previously. Unfortunately the cost of computing 
the denominator of Eq. {6.17} predominates, and we have an 
ealtermetive, but still costly, directly related search 
indicator. 

The next three search indicators were developed in an 
attempt to recognize some additional factors that could be 
used to reduce the computational burden of the original set. 
Their physical interpretations are not as clear, but they 
are logical extensions to consider. 

The ninth search indicator is the value of the L2-norm 
of the vector pA@Zi-1) Boemommeby Eq. (6.4), that is; 

T 1/2 
eye) = nord 65, ¢2/7i-1) = [2 (ava) b ,(i7i-1)] L6= 1S} 
Since Begiy = 1) is a P x 1 vector obtained at the cost 
OC PN+1), the calculation of I(j,9) requires O(PN+P+2) 
Operations for each candidate model term. This is the L2- 
norm of a vector composed of time averages between the 
Signals specified by each of the existing model terms and 
the signal specified by the new candidate model term. Since 
this vector corresponds to the reduced version of matrix 
Ot 11) eee ering in Eq. {4.30} through {4.32} and Eq. 
{6.4}, it was conjectured that its length might have some 
Significance. Unfortunately, it also has a high cost and 
therefore offers no advantages. 

The tenth search indicator is the value of the Lé=-norm 


of the vector £,(i/i-1) Sivenmovekq., {6.5}. 
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2 ie 
Pheer) ge nome t,(i/i-1) = [£,(4/4-1) £,(4/1-1)] {6.19} 


Since £,(i/i-1) is a P x 1 vector obtained at the cost 
o(pt+pn+1), the calculation of I(j,10) requires 0(P*+PN+P+2) 
operations for each candidate model term. This indicator is 
the L2-norm of the matrix product of the preceding vector of 
time averages in bj,(i/i-1) and the inverse of the previous 
model least squares matrix A(i-1). This resulting vector 
corresponds to the reduced version of matrix F(i), appearing 
im Eiq:. { amiso}, Eq- 4.34}, and Eq. {4.37}. It was 
conjectured that the length of this vector might have some 
significance to the growth probiem. Tables 5 and 6 show 
that is suffers from a similar high computational cost. 

[ae elewenmth s@arch indicator is the inner product 
om the yeetors bj(i/i-1) and £,(i/i-1). 
roj,at) = ae foe tal) {6.20} 
This value appears in the calcuiation of matrix G(i) in Eq. 
{4.31} and aiso in I(j,5). Since Ge iets a Pox | 
vector obtained at the cost 0(P*+PN+1 ) which inciudes the 
cost of computing the P x 1 vector ANGY oF the 
calculation of I(j,11) requires 0(P*+PN+P+1 ) operations for 
each candidate model term. This second group of search 
indicators I(j,9) through I(j,11) do not appear to offer any 
advantages over the first group of indicators. 

At this point we will leave the domain of proven results 

and use experimental analysis to develop other search 


meaiecators for the model growth problem. We provide 


er 





mathematical justification wherever possible, but these are 
the resuits of mathematical rationalization based on 
experimental findings. The following factor is the main 
result obtained after many detailed experiments using 
computer simulated systems and a controlled input sequence. 
the tweifth search indicator is defined as the ratio of 
Pere) squate of the vaiues of the third search indicator, 
@evided by the fourth search indicator. 
($12) =) 26453)" /1(4,4) {6.21} 
This twelfth indicator was experimentally developed as a 
heuristic compromise to the computational and performance 
limitations of some of the preceding indicators. One 
Srp lanation of thélm@éaning for this search indicator is 
described beiow. 

The improvement in the fitting error resulting from the 
invoivement of just the a” candidate model term is 
defined as Berei/ii-1 ), and can be obtained from kq. {4.36} by 
reducing this general vector equation to its simpler one- 
term model form. SLace g(i) and k(i) become ge; (i) and 
ky (i), respectively, we obtain; 

Jo (i/i-1) Err) - 3°(4=1) = [e(a) x(a) - g, (i) k;(i) 16.22] 
Substituting Eq. {4.33} phi (6. {6.22} wields: 

2(s/s-1) = g,(4)6,(4) 8,(4) = Ce, (a)]?/6,(4) 16.23] 
sao stivalting Eq. {6.4} and {6.5} Lac oO {6.14} produces 
another expression for the reduced version of the 


matrix G(i) to the scalar G,(i). 


1 es 





Cet) = 1 w(4/4-1) Ww (4/4-1) 
“2 wy (4/421) WOH ACG)? WOT) wy (4/421) 
N 
T =) T 
mm me tyt-1) | tT = wk i-1)A(4-1) W(i-1) ] w (1/i-1) 
N 
. 1 (4/421) B(4-1) ee) {6.24} 
N 
where H(i-1) = [ I - (Cte en) ie ee ] {6.25} 


The matrix H(i-1) is a function of the preceding model, 
and not a function of the candidate model term. Therefore 
it can be considered as a constant scaling factor for each 
candidate term evaluation at any model iteration step. 
Matrix H(i-1) is positive semi-definite since the scalar 
Get) cannot be negative, and H(i-1) is also idempotent. 
Since G (i) is a quadratic form, we can use a quadratic 
identity [Ref. 18, pp. 254], and write it as: 


G (1) = trace [1 w.fi/i-1) “hafta ne Hiz=1) | {6.26} 
Y act = Jj 


After many attempts, we are still unabie to reduce Hq. 
{6.26} to a form that can be more efficientiy computed. 
Based on the properties of matrix H(i-1), and the heuristic 
belief that the trace of the matrix in the square brackets 
of @q@. {6.26} is an t@portant factor to consider, we make 
hae fTolLOsine approximation. Justification for this 
approximation wili be given in a subsequent theorem. Using 
Eq. {6.24}, {6.25}, and {6.26}, avproximate G, (i) by its 


; a 
maximum vaiue, G,(i)- 
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wyGQ/i-t) Wyte") (6.27) 


+ 
Gj,(i) = max Ga) = trace [ j 


=| 


Same ysis Of Eq. {6.27} and Eq. {6.13} lead to the 

recognition that the trace of the matrix in square brackets 
pe ey {6527} e€qWals search indicator I(j,4). 

mer) = C5, 4) {6.28} 
Bemeictatuting Eq. {6.28} into Eq. {6.23}, and using 


EGjas) for SF oe results in the new search indicator; 





1 warientea-n]? 
N 
2 
(4 © eer o> ane 
ry.) 


wy Ci/i-1)" wy (44-1) 
(6.29) 


af 

N 
Since WyCi/i-1) mea i X | vector, anid e(i-1) is ai x | 
Weetor Obtained at the cost O(PN), the calculation of 
I(j,12) requires O(PN+2N+4) operations for the first 
candidate model term. For the second and subsequent 
candidate model terms the cost is reduced to O(2N+4) since 
Soe wicemoncady Ween calculated. “Note that I(j,12) is a 
normalized version of the square of g,(i), and therefore 
BWould be a better indicator than I(j,3). It is also 
Gifeliaiper to calculate than I(j,8). These preceding order of 


complexity equations appear in Table 5 and Table 6 along 


with some numerical examples. 


je3.D 





SEARCH N=50 | N=50{N=10Q N=100/N=5001N=500 
INDICATOR COMPLEXITY Pes) P=10 [ Rit) P=50 {|P=5 BESO 


O(N+1) 






O(P*+NP+N+P+2) 
O(NP+N+1) 


O(N+1) 









0(P7+NP+N+P+2) 
0(P7+2NP+2N+P+4) 
O(P*+NP+oN+2P+4) 


0(p7+2NP+2N+P+5) 








O(NP+P+2) 
0(p*+NP+P+2) 
O(p*+NP+P+1) 


O(NP+2N+4) 


TABLE 5: Order of complexity (Number of multiplications or 
divisions) required to compute each search indicator value 
for a singie candidate model term. Various examples of 
model size P and measurement sequence length N are inciuded. 


s@M@e of the factors required in the calculation of the 
search indicator vaiues of the first candidate model term at 
Pee erowene iteration, can be used in the calculation of 
other search indicator vaiues for this term and subdsequent 
modei terms. This can be exploited to produce a lower 
computationai complexity for each candidate model tern 


beyond the first as shown in Table 6. 
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SEARCH N=50 |N=50 JN=100]N=100 
NDICATOR COMPLEXITY Pp=5 | Or = 5 P=50 


O(N+1 ) 


N=500 |N=500 
P=5 









o(pt+NP+N+P+2) 
O(N+1 ) 


O(N+1) 














0 (pt +NP+N+P+2) 
0(P*+NP+2N+P+4) 
0 (p*+NP+2N+2P+4) 
0(P*+NP+2N+P+5) 


O(NP+P+2) 





0(p7+NP+P+2) 
O(p*+Np+pP+1 ) 
0(2N+4) 
TABLE 6: Order of complexity (Number of multiplications or 
divisions) required to compute each search indicator value 
for subsequent model terms beyond the first. Various examples 
of model size P and measurement sequence length N are included. 
Based on the preceding development of I(j,12), we state 


and prove the following theorem. 


THEOREM 2: LOWER BOUND ON REDUCTION IN FITTING ERROR 
I(j,12) is a lower bound on the improvement in the 
rect ime emror resuiting from including the sgingle model term 


ae ed 
ey in, i/3-1) at the (i) growth iteration. 


Von 





PROOF: 
From Eq. {6.12}, I(j,3) = g ,(i). Substituting this into 
Eq. ae 5't yields; 
_—— aga 
J;(i/i-1) “= Maps) / G, (i) {6.30} 
From the development of Eq. {6.24} and Eq. Nomen we see 
that eo) can be written as; 
- 
G, (i) eee Po = I(j,4) - P ; {6.31} 
where P is nonnegative and P; < igi). “iheretomc: 107.4) 
is an upper bound on Gi (i). Applying this last result to 
Bq. {6.30} and Eq. {6.29} yields the result that I(j,12) is 


2 
a Lower bound on J. (i/i-1). 


We have shown how the value of I(j,12) is related to the 
improvement in the error fitting criterion. Tables 5 and 6 
Saow that this search indicator can be computed with a very 
bow co@putational cost. In fact, the cost in Table 6 is not 
a function of the size P of the existing model, only of the 
number N of data measurements. 

The power of this new search indicator is significant. 
The computer simulated and reai world experiments we have 
performed indicate that it is an excellent indicator of the 
Fittimg error improvement that results from inciuding the 
candidate modei term. Because the value of I(j,12) is 
proportional to the square of I(j,3), it rarely happens that 
a term with iow I(j,12) wiil have a significantly large 


vaiue of I(j,8), the actual fitting error improvement. The 
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fact that I(j,12) is easily computed adds to its 
significance. 

For starting the model growth, we can select the subset 
of terms in w(n,1) with the one or two largest values of 
I(j,12). At this first iteration there is no error residual 
Signai since there is no existing model, so we use the total 
model output sequence {y(n)} in place of {e(n,0O)}. While we 
have not been able to prove that this manner of specifying 
site a ii & OT w(n,1) prevents inclusion of unneeded terms, 
results of many experiments show this method provides a good 
set of starting terms and generally yields more compact 
models. 

We have examined the characteristics of search 
indicators I(j,1) through I(j,12) under experimentai 
conditions. This invoived numerous experiments with 
synthesized systems, a controiled input probe sequence, and 
the assumption of no additive output noise. A subsequent 
section examines the robustness of model growth in cases 
where preceding assumptions are relaxed. 

A thirteenth search indicator is designed for a special 
purpose. We previousiy discussed the potential problem of 
nearly equivaient performance from different model terms. 
This ieads to ill-conditioning of the least squares matrix 
am@ Che possibility of multiple solutions. 

The thirteenth search indicator is the maximum result 


chosen from the set of squared and normaiized time averages 


eS) 





obtained from the product of the signal specified by the 
candidate model term, and the signals specified from each of 


the other q(i) candidate model terms at this iteration. 





n 2] 
I(j,13) = maximum v x Mi (Tin / 1-1) eae 1) 
1<=k<=q(i)} IN n=n, 
j#k n3 2 n3 2 
1 Yow,(n,i/i-t) 1} w (n,i/i-1) 
N n=h, N n=n, 16.32} 


Examination of Eq. Panog | and Eq. {6.327 shows that the 
vaiue of I(j,13) equals the maximum ratio of the square of 
each off-diagonal element of the a row of A(i/i-1), and 
the product of the diagonal elements of the corresponding 
column and the a row. im physical terms, large I(j,13) 
means there is the possibility of significant correlation 
between the signal specified by the ee candidate model 
term and the signal specified by another candidate model 
term. This is related to the muitiple ee eee 
coefficient in regression analysis. The following set of 
theorems show that a necessary condition for the least 
Squares solution to represent a unique minimum is that 
ACi/i-1) be positive definite at each growth iteration. 
They also shown how this condition is reiated to the values 
Sf itj,15). Theorem 3 is weil known in the iinear algebra 
an@ Maurie literature, and is included for completeness. 
THEOREM 3: 

The elements of a positive definite matrix D = [4, ] 


Beat satisfy the inequality; 
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2 on 
oh Secl for all j#k {6.33} 


PROOT: 

Assume that the converse of ieee. existed for some din: 
We could interchange the Keo. eceoiumn of D with the ey 
column, the er column with the cy column, the (1)°° 
row with the aye row, and the (oes row with the Ge row 
Methout affecting the definiteness sf D. From the converse 
ot {6.33}, thee @@temminant of this 2 x 2 principle submatrix 
of D would now be less than or equal to zero, and D could 
not be positive definite. Therefore {6.33} is a necessary 
eond 2 t ion. 
THEOREM 4: 

A necessary condition for the matrix A(i/i-1) given by 
Eq. {4.27} to be positive definite, is that the value of 
rie 1B) fom Gatch of the q(i) terms in the aes HPeLreracion 
Must satisfy the foilowing inequality. 

oe a a for all j, j=1,2,..-,qa(i) {6.34} 

PmoOOF: 

From fq. {4.27}, ail diagonal elements of A(i/i-1) are 
nonnegative. From Eq. {6.32} and Theorem 3 we see that 
{6.34} is; ee@mivalent to {6.33}, and therefore {6.34} is a 


necessary condition for A(i/i-1) to be positive definite. 


THEOREM 5: 
A necessary condition for the uniqueness of the soiution 


of the normal equations | Aveo and repeated below; 
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ACi-1) ye Re) : 
ACi)OCi) = ae eee SCs = hi) Orn st 
Bie i= 1) | ACi/i-1) 
Ha Ghat che value of £(j,13) for each of the q(i) terms in 
the eo Lteration must satisfy the following inequality. 
Meyers <1 Pomme! j; jee, ... 790i) {6.36} 
PROOF 
Priome Chapter III, Eq. {3.18} describes the condition 
that the least squares matrix A(i) must be positive definite 
for aa to represent a unique minimum. Ti mati) lsenot 
positive definite, the system of equations given by Eq, 
{6.35} contains more than one set of solutions that 
SPquivalently minimize the fitting error criterion. The 
least squares error minimization may become extremely 
unstable since the minimum will tend to lie on a line or 
surface in parameter space, Faviwer thian-at a point. ~From 
Theorém 4, Eq. {6.34} is a necessary condition that A(Ci/i-1) 
bS positive definite. EGefollows directly trom the proof of 
Theorem 3, that a necessary condition for Ai) to be 
positive definite is that A(i/i-1) is also positive 
definite. Therefore we see that {6.34}, or equivalently 


{6.36} is a necessary condition. 





Other possible search indicators are contained in the 
Set 
area of the patterns in the residual sequence of the (i-1) 
model. We know that when we have completely modeled a 


system using noise free measurements, the error residual 
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should be a random sequence without any trends or patterns. 
ie fact, we expect that this case will produce an error 
residual sequence composed of a series of very short 
segments of alternating sign. It is reasonable to expect 
that the pattern of segments we see in the residual when we 
have undermodeled the system, is representative of the 
missing term(s) in the model. The problem is to learn how 
to decode this information from the patterns in the Gre 


model, to aid us in selecting the missing term or terms. 


C. SSAaRCH INDICATOR GROWTH ALGORITHM 

Our proposed Search Indicator Growth Algorithm is 
represented in Figure 23. We start by specifying a very 
large set of candidate model terms. Our algorithm picks the 
Subset of candidate model terms whose I(j,12) values are 
greater than some specified value of the variable hy (e.g. 
70% of the maximum value of I(j,12) for any candidate term). 
Before adding the selected term(s) to the model for 
Subsequent evaluation of the fitting error, we calculate the 
Vemmetewor 1(j 713) for each selected term using Eq. {6.32}. A 
second heuristic variable h. is used to indicate when 
significant colinearity is present. Values of I(j,13) close 
tore | inereate that the Goo candidate model term (out of 
the selected set) is nearly linearly dependent on another 
Candidate model term. This other term is used in the 


th : 
Gamculation for the (k) ROwewot Aol i= 1 and contributes 


to the large I(j,13). When I(j,13) is greater than Nos we 
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discard the candidate model term of this pair that has the 
Hewer Value of I(j,i2), rej=estimate I[(j,13) for the 
Memalining term, amd continue until all values of I(j,13) are 
sufficiently small (erm. less than 0.85). 

This iterative twoe-phase growth technique 1s based on 
the terms selected by the search indicators, and has a much 
lower computational cost than the complete evaluation of 


fe?) for each combination of possible new terms. 
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STEP 


STEP 


STEP 


STEP 


STEP 


BLGURE 23: 


TO: 






ACCEPT SYSTEM INPUT AND 
OUTPUT MEASUREMENT SEQUENCES. 
SET MODEL ITERATION INDEX i=1. 





CHOOSE SET OF CANDIDATE 
MODEL TERMS FOR ITERATION i. 


COLCUDATE I(3,12) FOR BACH 
CANDIDATE MODEL TERM. 


SELECT SUBSET OF CANDIDATE MODEL 
TERMS WITH I(3,12) GREATER THAN 
SOME SELECTED LIMIT h,. 






GSLCUGATS I(j,13) FOR BACH 
CANDIDATE MODEL TERM IN THE 
ABOVE SUBSET. DETERMINE. 

THE RELATED PAIRS OF TERMS. 







FOR EACH PAIR OF TERMS WHOSE 
I(j,13) VALUE IS GREATER THAN SOME 
SELECTED LIMIT ho, DISCARD THE 
MODEL TERM WITH THE LOWER I(j,12) 
VALUE. CONTINUE UNTIL ALL I(j,13) 
VALUES ARE LESS THAN ho. 






EVALUATE THE RESULTING MODEL 
USENG THE RECURSIVE EVALUATION 
AND SOLUTION EQUATIONS. 






STEP 9: 
TO 


ees(grimeten sv: 





YES 


REALIZE MODEL i AND 
PRODUCE THE ERROR 
RESIDUAL SEQUENCE 









Se © 1% 


Gee ULATS THE VALUES OF ALL MODEL COEFFICIENTS. 
VERIFY THE PREDICTION PERFORMANCE OF THE MODEL 


WITH 
ACCEPTABLE, 


NEW SYSTEM DATA. 
moon GO TO STEP 9. 


STOP IF PERFORMANCE IS 


PagweDimweram of the Search Indicator Growth Algorithm 
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Tremneurdstic variable hy determines the number of terms 
selected for inclusion in the model, and can be set based on 
Che distribution of the values of the search indicator 
ioinebeee Lf there is a grouping of terms with high values 
for I(j,12), they should probably all be accepted into the 
model. If there are only a few terms with high values of 
I(j,12), we should select them all, plus possibly a few more 
“wiatphesidighthy lower values of E(j,12). There is a 
disadvantage of selecting ny too small, since this can 
result in the requirement for extra iterations in order to 
obtain all of the needed terms in the final model. 

The heuristic variable n, determines the amount of 
colinearity allowed between model terms. If chemen too low, 
it will delay or prevent acceptance of actually needed model 
terms that happen to be somewhat correlated with existing 
model terms. If chosen too high, it ee extra terms into 
the model and thereby increase the ill-conditioning of the 
least squares matrix. We have experimentally found the 
range 0.7 <= nh, <= 0.85 to be most effective. 

Tite weet seetion examines the computational cost of 
model growth using the techniques discussed to this point. 
Dee result is that model growth uSing the search indicator 


techniques developed in this chapter offers a new and 


efficient means of obtaining models. 
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D. COMPUTATIONAL COMPARISON OF GROWTH TECHNIQUES 

This section examines the algorithms and computational 
cost associated with the model growth techniques presented 
in this thesis. We use N to denote the number of data 
beens, Cli for fhe number of model terms in iteration i, 
and q(i) for the number of candidate new terms at this 
iteration. Therefore q(i) = e(i)-c(i-1). Each growth 
technique is presented in algorithmic form as a series of 
Steps, and the number of multiplicative or division 
computations required at each step is indicated. The 
details of these order of complexity calculations are based 
on the size of the various matrices, vectors, and sequences 
used in the model growth, and are included in Appendix C for 
the interested reader. ie Computational cost equation is 
formed for a full iteration at the end of each technique, 
and an example is used to emphasize the differences in the 


Zrowth techniques. 
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Techniaque 


Step 1: 
owem 2): 
Siem 3: 
Step 4: 
Sem 5: 


Step 6: 


Seep 7: 


153 Direct Least Squares Computational Cost 


Set im 1, form term vector «<(n,i) 

Form R(i) using Eq. {4.5} [n+1 ]e(i)[c(i)+1]/2 
Form r(i) using Eq. {4.6} e(i)[wet ] 

Invert R(i) Peers |/6 

Solve for mee) using Eq. {4.3} [e(i)**2 ]+c(i)+N+1 


2 
If J (i) < a@eeptable level, stop. 
Else; 
[ee i ~~ i+!|, form a new term 


wector™(m,i). Go to Step 2 


Teeel cost for Stéps 1 through 7 is; 


O(a) = 


[e(i)**3]/6 + [[u+3]/2][c(i)*#2] + [[3n+5]/2]e(i) + Net 


Eoipie:) @ S500, c(i) = 10, c(2) = 20, c(3) = 30 


Ltera tion Number On multiplicative operations 
1 D545 
2 mer 4o > 
S 253926 
mOTHL = 404754 
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Technigue ed: 


ll 


Step 
Step 
Step 
Step 
Step 


Step 


Step 


Step 
Step 
Step 
Step 
Step 
Step 
Step 
Step 
Step 


Step 


Step 


Step 


WK 


2: 


18: 


Set i =? 
Form R(i) using Eq. 
Form r(i) using Eq. 


Invert R(i) 


oS 
{4.6} 


Block Form Recursive Growth 


Vy icwme term vector x(n,i) 


Soive for eG) using Eq. {4.3} 


If ra) < acceptable level, stop. 


Else; 


Sey 2 i*1, form a 
vector w(n,i/i-1) 
Form A(i/i-1) using 
Porm B(i/i-1) using 
Form h(i/i-1) using 
Porm F(i) using Eq, 
Form G(i) using Eq. 
Porm a(i) uSing Eq. 
lmvert @(i) 

Form k(i) using Eq. 


Solve for Teed) 


re ee < acceptable level, stop. 


uy 


ise; 


Form inverse of A(i) using 


Bq. (4ea7} 


Goetororep 7 


new term 


Bo. 14.27} 


Eq; 


{4.26} 


Eq. {4.29} 


{4.30} 
{4.31} 
{4.32} 


{4.33} 


using Eq. 
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{4.36} 


Computational Cas c 


[w+1 Je(i)[e(i)+1]/2 
e(i)[ N+ ] 
[e(i)**3 ]/6 


[e(i)**2 ]+c(i)+N+1 


[n+t ]a(idla(i)+1]/2 
a(i)P[N+t ] 
a(i)[n+t ] 

aq( i) [Pea] 
Pilg 1 eee 2 | 

Pq (i) 

facies] /6 

qe 2 


q(i) 


q(i)[P**2 | 


Pei 2 | 





Cece. Or Steps 1 through 7 is the same as Technique 1. 


homme cOstetor Steps 8 through 17 are; 


Ome = qe 443176 + (PeN+3)(9q(1)4**2)/2 + gli) {NP+[P*##2J)+2P+( 3N+5/2] 


Gocem@er Seen 18 is: O(n) = q(i)(Pp¥¥2] + Plq(i)y**2] 


Example: N = 500, ¢c(1) |= 10, e(2) = 20, ¢(3) = 30 


Iteration Steps Number of multiplicative operations 
1 = 7 33343 
2 8 = 18 84542+2000(for step 18) = 86542 
3 Ont 17 138240 
da by ea — 256127 





Teecnnaque 3; 


Step 


Step 


Step 


Step 


Step 


Step 


Step 


Step 


Step 


See p 


Step 


Step 


oe"p 


Step 


lg 
ia 


Bie 


Us 


ire 


se: 


lias: 


csemmch Indicator Growth Algorithm Computational Cost 


Se e-meie lOnmetenm vector x(n ,i) 
Form R(i) using Eq. {4.5} iNet 1ecr) levi )y+y i172 
fOwmor(i) using Eq. (4.6} c(i) (N+1] 


En@ert Ri) [eCi)**3]/6 


Solve for 5 usSime Eq. {4.3} [eC i) **2]4+c(Ci)+N+1 
ie pact) < acceptable level, stop. 

Else; 
s@t 1 2+ i141, form a new term 
vector w(n,i/i-1) 

Romm I1(j,12) for each termm in 
an v= | )uisinige Eq. (6.291 NP + [2N+4]q(i) 
Select the subset of k terms with 

Vale's Of I(j,12) greater than a 

specified level ny- Reduce the 


vector w(n,i/i-1) to only contain 


Phisesubset of k terms No "Gest 
Form ACi/i-1) using the reduced 
Viemcioonwmcn.,i/i=1) in Eq. (4.27} kKCK+1]{N+1]/2 
Form B(i/i-1) using the reduced 
veemorew(n,i/i-1) in Eq. {4.26} KPCN+1] 
rommenh(i/i-1) using the reduced 
Welemrorawin,i/i-1) in Eq. {4.29} KUN+1] 
rorm F(i) using the reduced 

econ wGn,i/i~1i) in Eq. {4.30} ee 2 | 
Form G(i) using the reduced 

Wemmor WOm,i/7i-1) in Eq. {4.31} P[kK#*2 ] 


lisse 








step 15: Form g(i) using the reduced 
FecGergw(n,i7i-t) in Eq. {4.32} Pk 
Stee 16: Inyert Gi) (res = 331075 
Seep if: Form k(i) using Eq. {4.33} (k*¥#2] 
Se@&p 6: Solve for ‘me Us ime Equ. {4 . 3.6} k 
Step 19: If ee) < acceptable level, stop. 
Else, 
Sieoe20; HOrm inverse of ACi) using 
Eg. (he 37 } Cee =2 | kKee PL) 
weep. c|l: Go to Step 7 


Gomme for §S 
Technique 


O@a) = 


+ NP +(2N44]q(i) 


Cost for Steps 20 through 21 


OCT) = 


Example: N 


Iteration 


] 


2 


Note: 


teps | 


2 


een oc (1) 
Steps 
1 - 7 
en 


19 


through 7 is the same as Technique 1 


Towed cost for Steps 8 through 


and 


eels 


Ck*##3]/6 + ([k#**2)](P+0N+3]/2] + kL 2OP+PN+(P*#*2)+(3N+5])/2] 


| 


(ew etk + PUK? *2] 


aeeOpmmenee) ©= 20, C(3) = 30, Let k = 3 


Number of multiplicative operations 





33343 

35017+390(for step 20) 35407 
49305 

Om nk = | Wei te3s 


Additional savings can be realized when I(j,13) is 


used to eliminate highly colinear terms in Step 9. 


we 





The preceding example shows that thie wocateh Indicator 
Growth Algorithm can require substantially lower 
computational cost than the other two techniques. Table 7 
summarized the results of the example. Because of this 
lower cost, we can consider a greater number of candidate 
terms during each iteration than would be possible with the 
direct or block-form techniques. This increases the 
probability that we will consider the terms actually needed 
in the model. The performance of this algorithm will be 


demonstrated in the experiments of Chapter VII. 


Technique Cost of Gost of Cost of Total 
; Iteration Iteration Iteration Cost 
1 2 3 


Direct Least Squares 117485 253926 4YO4uTSY 


Block form Recursive 86542 138240 258127 


Search Indicator Growth 35407 49305 113055 





TABLE 7: Computation Cost (Number of Multiplications or Divisions) 
Required im example of Section D, Chapter VI. 
FE. FACTORS AFFECTING MODEL EVALUATION AND GROWTH 
Chapters [I and II mentioned that there were two main 
Paebors that can limit the ability to accurately model a 
system from input and output measurements. These are; (1) 
the ability to control the input signal applied to the 
system, and (2) the presence of output measurement noise. 


The four permutations of these two factors are represented 


in the following table. 
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ae | SYSTEM IagPUT OUTPUT MEASUREMENTS 
. 2 patel RS Delco iO Bees Noise 


WREEPwS: System Characterization Conditions 


Other factors include the form of the system and the 
model Cems: other than BVM), choice of error minimization 
method, and selection of sampling interval (over or under 
sampling is a possibility). It is assumed that these last 
two factors are not a probiem in the examples we consider. 

We have been primarily concerned with Case 2A in this 
thesis because it ailows us to focus on just the choice of 
model terms. In the computer simulated experiments, we 
generate an input probe using a uniformly distributed 
pseudo-random number senerator. The amplitude vaiues of 
this sequence are scaled to cover the known (or assumed) 
operating range of the nominal system input. We then apply 
this input sequence to the system, and use the resulting 
Qutput sequence along with the input probe sequence to grow 
the model by any of the techniques discussed in the 
preceding chapters. 

& uniform distribution was chosen for the input probe 


bew@er chan the gaussian distribution typically mentioned in 
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the literature, based on the following argument. Nonlinear 
terms contribute to the output sequence in a nonlinear 
amplitude dependent manner. Since we don't know the form of 
the nonlinear system terms, select an input probe that is 
equally likely to take on any value in the aliowed range. 
One could, of course, postulate system examples where a 
MGMUNnITOTR input probe amplitude distribution provides more 
efficient modei growth (e.g. more significant differences in 
the behavior of the candidate model terms). 

Case 2B has additive output noise contaminating the 
system output sequence {y(n)}. This is the next step 
ee@gards the situation we must face in the real world. If it 
is reasonable to consider this additive noise to be zero 
mean, stationary, and uncorrelated with the system input, 
then we can perform some fiitering to reduce the distortion 
ex@Mined in Chapter III. Since we still control the input 
sequence, we can measure and record the noisy output 
sequence for M repeated applications of the identical input 
sequence. A point-for-point ensembie average of the noisy 
System output can then be performed, which reduces the 
variance of f{s(n)} Dywuie factor M. Mvs fiiters the output 
variation due to the additive noise, and we can grow a model 
using the input sequence, and the output sequence 
corresponding to the average of the noisy output sequences. 
This technique has been tried experimentaily and produces 


improved results. An #xample is presented in Chapter VII. 


eet, 





Case 1B has no additive output noise but we must work 
with the given input sequence (e.g., we cannot probe the 
system ourselves). If the given input sequence is 
Sag@Ticwently wideband or “persistently exciting" [Ref. 14, 
pp 42), then the least squares matrix A(Ci) will be well 
conditioned at each growth iteration i, and the growth 
Pechnigqgues provide useful results. Bachwspecificwease of 
input signal, system output, and model form must be examined 
experimentally to determine if the evaluation equations are 
well conditioned. Exmamination of the amplitude distribution 
and the empirical sample autocorrelation of the particular 
input sequence gives a qualitative measure of the 
suitability of the available input for systems 
characterization. Much work needs to be done in rating a 
given input signal for use in systems characterization, and 
this is suggested as an area for future re 
Ultimately, it is the value of the obtained final model in 
the intended application that determines the adequacy of the 
PApU Ssh coral wised in the ch@racterization. 

Caormremle @s Ure pest difficult set of conditions for any 
model growth technique. Even if we knew the exact form of 
the final model, and were therefore just doing parameter 
estimation, the output noise would eerade the model growth 
and emaluetivon epror. We may not obtain a useful system 


characterization under these conditions. 
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Case 1A is the situation normally encountered when we 
don't have control of the experiment that obtains the 
measurement data. Two possible techniques to try are as 
follows. Using the given input and output meaSurement 
sequences, we could use the search indicator growth 
algorithm until we reach a limiting number of model terms 
(e.g. N/10), or until there was no significant improvement 
mY tme filttaaege error. AtC Chis point we "freeze" the current 
model and simulate it on the computer. Byeeprobding this 
mathematical model with the given input sequence 
{u(n); S<=n<=T}, we can produce the model output sequence 
{7in)}. UWsa@meg a nonlinear iterative algorithm such as 
Marquardt [{Ref. 17], we could perform an iterative nonlinear 
analysis in an attempt to refine the parameter estimates and 
reduce the Magmitude of the output error @(n) = y(n) = #(n). 
Using this corrected model, and the least squares matrix and 
vector corresponding to it, we could then grow from this 
Dole @srne che search Indicator Growth Algorithm. es 
WNOo—-pnase process could continue until no significant 
decrease in J is obtained. 

Another proposed technique would be to grow a 
nonrecursive model like eRe VOlECd.m) from the input ‘ind 
noisy output measurements, uSing the Search Indicator Growth 
Algorithm. The nolsy output data would not distort the 
coefficients of a nonrecursive model, and it might be 


possible to obtain a reasonable fit. Since there is noise 


aw’ 





geaem tO tne System output, a stopping criterion such as 
independence of the residual sequence {e(n,i)}, as measured 
by its autocorrelation sequence, would make more sense than 
the @magmitude of the fitting error ee). When a 
memrecursive model with a limiting number of terms (e.g. 
BeelO) 1S ODtagsned, or {e(n)} is found to be uncorrelated, 
then a second phase would be used. The previously 
determined nonrecursive model would be used along with the 
input signal to produce the model output {¥(n)}. The input 
signal {u(n)} and the nonrecursive model output {¥(n)} would 
then be used to grow a more general and probably more 
compact recurSive model like the BVM and using the Search 
Paegteaecor Gromthi Algorithm. This concept could be expected 
to reduce the effect of the additive output noise. We 
denote this as the "N-R" technique because it uSes both 
nonrecursive and recursive models. 

This concept is related to a recently developed two- 
stage least squares parameter estimation algorithm for 
linear systems [Ref. 42]. The method presented here is more 
powerful since it is applicable to model growth for 
nonlinear systems and uses the efficient search indicator 
growth algorithm developed in the previous section. 
Experimental analysis of the method discussed in this 


S@et@on 1s providied in Chapter VII. 
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Vil. “EX PERTMENTS IN SYSTEM CHARACTERIZATION 


a. #)rSecussi0n 

The preceding chapter developed the Search Indicator 
Growth Algorithm and showed the computational advantages 
that result from its use. The next step is the experimental 
@valWation Of the peritormance of this proposed algorithm in 
characterizing systems. These evaluations include 
comparisons with the performance of the block-form 
techniques developed in Chapter V. 

This chapter contains several experiments designed to 
demonstrate the strengths and limitations of the model 
growth techniques presented in this thesis. Ineehe first 
six experiments we synthesize a given system equation on a 
computer, and generate a finite length pSeudo-random input 
sequence {u(n)} uniformly distributed between chosen 
amplitude limits. Each case involves probing the system 
equation with the input sequence to create an output 
sequence {y(n)}. These input and output sequences are then 
used aS data points for the model growth techniques. 
Various system features and meaSurement noise conditions are 
Rnelu@@a ior illustrative purposes. 

The advantage of using Mitnesized systems is that it 
allows us to examine the properties of the model growth 


techniques under conditions that do not obscure the key 


159 





differences. We can more clearly see the weaknesses of some 
techniques and verify how other techniques can compensate 
for related problems. The Covariance error minimization 
method is used for each growth technique because of its 
Superior performance (Chapter III). 

Tiem card section of this chapter examines the 
Capabilities of our best growth techniques on a real world 
example, where we must work with the single set of available 
measurement sequences (Case 1A in Chapter VI). Verlfication 
of the modeling results is not as direct in this case since 
the actual system equation is unknown. This real example 
verifies some of the inherent weaknesses of model growth 
techniques when we are faced with Case 1A conditions. The 


final section Summarizes the experimental findings. 


B. COMBROLLED EXPERIMENTS 

The systems used in these experiments were not selected 
to bias the findings in favor of any technique. We have not 
excluded any examples or experiments that produced contrary 
results. The following set of experiments are honestly 
considered to fairly examine the baSic properties of the 
various model growth techniques. We start these experiments 
With Experiment 2, Since Experiment 1 is contained in 
Chapter IIli. 

Exper iment. 2 
The purpose of this experiment is to demonstrate that 


the restricted growth properties of the block-form model 
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growth techniques generally lead to a high condition number 
hormmeeme. Weast sdi@ares matrix A(i). This situation can 
extend to the extreme point of ill-conditioning where these 
growth techniques fail to converge on an adequate model. 
The Search Indicator Growth Algorithm allows unrestricted 
momged growth, is robust to ill-conditioning, and typically 
finds an acceptable model when block-form techniques fail. 
We synthesize the following nonlinear system. 

4m) = 1.0 u(n) + .8 ulme1) + .6 uln-2) =- .9 y(n-1) 

= ./ ¥(@—e) ee 24 utn)ut(n) - .2 ul(n-1)u(n-1)y(n-3) 

- .1 y(n-1)y(n-2) y(n-3) - .12 u(n)y(n-3)y(n-3) en 
A random input probe {u(n);1<=n<=200} is generated uniformly 
distributed between the amplitude limits of -2 and +2. The 
system output sequence {y(n)} is produced by probing the 
system of Eq. {7.1} with the input sequence {u(n)}. 
Starting with evaluation of the base model BVM(1,1), we 
recursively grow models by each of the six block-form growth 
techniques” of Chapter V and the Search Indicator Growth 
Algorithm of Chapter VI. The condition Foner anid error tic 
for eacn model are evaluated at each iteration, and the 


results presented in Table 9. We also include the results 





oO Seeuimschne "M Directed" and "“D Directed" growth 
algorithms require a significance test for switching between 
their two phases (See Figure 21 and the discussion in 
Chapter V). For clarity of presentation, we assume that 
there is a test that recognizes the place to change phases 
after going one increment too far (e.g. we turn after m=3 or 
Gao resmectively). The tables for each of the following 
experiments show where these phase changes are made. 
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ef avdirect Least squares modei evaluation using the exact 
form of the system as a comparison basis. Table i0 contains 
additionai detaiis of the more successful characterization 
Dy tme Se@rch indicator Growth Algorithm. 


SQUARE ROOT 


TTERATION NEt SeUses=D TOTAL CONDITION OF FITTING 
METHOD NAME i MODEL TERMS TERMS TERMS NUMBER SRROR: J 
M DIRECTED 1 3VM(1,1) 3 z 3 .10475+*02 .5183E+00 

2 3VvVu(1,2) 2 2 5 -1769E+02 .4737E+00 
3 avn( 1,2) 2 2 vi -3330E+*02 .4532E+00 
4 3Vm(1,4) 2 2 9 -4177E+O2 .41145+00 
(see footnote 9) 4% 3BYM(2,3) 28 28 35 -1178E+*05 .1540E+00 
5 Bww(3,3) 84 84 ae! -1349B+08 .1668B-02 
D BISECTED 1 37M(1,1) 3 3 3 ~1047E+02 .5183E+00 
2 3VM(2,1) 6 6 9 .13865+03 .34212+00 
3 BYM(3,1) 10 10 19 -41055+04 .32808+00 
4 2VM(4,1) 21 21 40 -4075B+06 .2959B+00 
(see footnote 9) 4* 3YM(3,2) 36 36 55 ~-6731E+05 .1832E+00 
> 3¥M(3,3) 64 64 119 .1334E+08 .2632E-02 
DIAGOBAL ; 1 3¥M(1,1) 3 3 3 ~-10475+02 .5%83E+00 
2 3vm(2,2) 17 ia 20 -92335+03 .237SE+00 
3YM(3,3) 99 99 119 -1265E*08 Diverging 
MeD €2G<Z2 a6 ! 3vVM(1,1) 3 3 z -1047E+02 .51383E+00 
2 3VMN(1,2) 2 2 = -!769EBE*02 .4737E*00 
3 3Vé(2,2) 15 15 20 .9233E+03 .2375E+00 
ry 37M(2,3) 15 15 35 ~1178E*05 .15408+00 
5 3YM(3,3) 84 B84 119 ~-1349E+08 .1658E-02 
D=-4% z2ic-7% 1 3VvM(1,1) 3 3 3 -1047E+02 .5183E+00 
2 3VM(2,1) 6 6 ) -1386B*03 .34215+00 
3 averee,2) it 11 20 .92372E+03 .2775E*00 
4 3VE(3,2) 35 BS 25 -6731E+*05 .18325+00 
= 3¥M(3,3) 64 64 119 -1334E+08 .25325-02 
NEIGHBOR 1 3vee(1,1) 2 c 3 ~1047E+*C2 .51833E+00 
2A 3VM(2,1) 5 5 ] -1386E°03 .34215+00 
23 eet, 2) 3 Bi 6 ~1769E+02 .472378+00 
2c 3Vm(2,2)° 17 i 20 .92735+03 .2375E+90 
BA. B¥YM3,2) 35 55 eS ~67448+05 .18325+°*00 
33° 3VM(2,2 15 15 ZS -1178BE+05 .1540B+00 
3¢ 6G =) «O99 99 119 -{334E+08 Diverging 
SEARCH 1 ayve(:.t) 3 3 3 .10475+02 .51835+00 
IEDICA?S 2 BYM(3,3) 116 1 4 ~1125EB+02 .3941E+00 
z BVE(35,3) 115 z iat -1341E*O3 .23465+0C 
4 3VMI(3,3) 108 4 15 ~-37115+03 .82085-01 
5 3¥m(5,3) 194 2 eg .62628+03 .1037E-05 
SRMOT MOSEL! OF PEF. SYSTEM 9 9 9 -3134E°02 .43175E-06 
TABLE 9 Sunnary Results from Exverinent 2 








Experiment: 2 PVeration: 1 Candidate Model: BVM(1,1) 
Number of candidate model terms, q(i) = 3 


Candidate Model Terms After First Phase Reduction: 


if Tera Tejicic) I(j,13) Related to Term 
1% u(n) -6583E+00 = - 
2% yoo") ~3040E-01 - = 
aa u(n-1) -elaoe—02 - ~ 


iT} 
LW 


Number of terms in final subset (marked with *), N 


b 
Lad 


foetal number of terms in resulting model, c(i) 
Condition Bumeer of least Squares matrix A(Ci), N= - 1047E+02 
Saumaete rcer of the fitting error, J(i) = .5183E+00 

Rémarks: We chose to select all the candidate model terms. 
Experiment: 2 Diieermatione. 2 Candidate Model: BVM(3,3) 
Number of candidate model terms, q(i) 

Candidate Model Terms After First Phase Reduction: 

i} Term TiGaiges 2 ) TC), 13) gRelated to Term 


i? Un martn ) . 1073E+00 


Number of terms in final subset (marked with #*), yea 1 


Teteal number of terms in resulting model, e(i) 4 
Condition Number of least squares matrix A(i), N 2258-02 
Square root of the fitting error, J(i) = .3941E+00 
Remarks: This term had an I(j,12) more than twice as large as 


all otter terms, so only this term was selected. 





TABLE 10: Search Indicator Growth Algorithm results of 


Experiment 2 (continued on next page). 
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Experiment: 2 Iteration: 3 


Number of candidate model terms, q(i) = 115 


Candidate Model: BVM(3,3) 


Candidate Model Terms After First Phase Reduction: 


ak Term Etc ) I(j,13) Related to Term 

1* y(n-2)y(n-2)y(n-2) .29458-01 .9583E+00 u(n-2)y(n-2)y(n-2) 
2% y(n-2) ~2701E-01 .7454E+00 u(n-2)y(n-2)y(n-2) 
5 u(n-2)y(n-2)y(n-2) .2410B-01 .9583E+00 y(n-2)y(n-2)y(n-2) 
4*® u(n-1)u(n-1)u(n-3) .22378-01 .1137B-01 u(n)u(n)y(n-2) 

5* y(n-1)y(n-2)y(n-3) .1975B-01 .8015E+00 u(n-2)y(n-1)y(n-3) 
6* u(n-2)y(n-1)y(n-3) .1956B-01 .8015B8+00 y(n-1)y(n-2)y(n-3) 
fw uln)ul ae n=2 ) -1895E-01 .5954E+00 y(n-2) 

s* u(n-1)u(n-1) ~1822BE-01 .6420E-01 u(n-2)y(n-2)y(n-2) 
Number of terms in final subset (marked with *), Pa 7 

Total number of terms in resuiting model, c(i) = 11 

Condition Number of least squares matrix A(i), N.= .13115+03 


e@@are root of the fitting error, 


Cc 


i) = = 25468700 


Remarks: The first phase picked terms with I(j,12) > .17E+00 


and the second phase kept terms with I(j,13) < 0.90 


TABLES 10: (continued) 





Experiment: 2 Iteration: 4 Candidate Model: BVM(3,3) 
Number of candidate model terms, q(i) 


Candidate Model Terms After First Phase Reduction: 


Term Tulle) iUji.05) Rediaitied to Term 
u(n)y(n-3)y(n-3) .8122E-02 .9072E+00 u(n)u(n-3)y(n-3) 
u(n-2)u(n-2)u(n-2) .6511E-02 .8535E+00 u(n-2) 

u(n-2) .6317E-02 .8535E+00 u(n-2)u(n-2)u(n-2) 
u(n)u(n-3)y(n-2) -5187E-02 .9072E+00 u(n)y(n-3)y(n-3) 
u(n-1)u(n-1)u(n-2) .5140E-02 .6022E+00 u(n) 

Number of terms in finai subset (marked with * > 

Total number of terms in resulting model, c(i) = 

Condition Number of least squares matrix A(i), aS ~- 37TO9E+03 


Square root of the fitting error, J(i) = .8208E-01 


Remarks: The first phase picked terms with I(j,12) > 0.50E-02 


and the second phase kept terms with I(j,13) < 0.90 
® 





TABLE 10: (continued) 





Experiment: 2 lweiratiem: 5 Candidate Model: BVM(3,3) 


Number of candidate model terms, q({i) = 108 

Candidate Model Terms After First Phase Reduction: 

# Term Tien 2 ) I(j,13) Related to Term 

1* u(n-1)u(n-1)y(n-3) .9677E-03 .6222E+00 u(n-1)y(n-1)y(n-3) 
ow n=1)y(n-1)y(n-3) .58258-03 .6222F+00 u(n-1)u(n-1)y(n-3) 
Number of terms in final subset (marked with *), N.= 2 

Total number of terms in resuiting model, c(i) = 17 

Condition Number of least squares matrix A(i), N= -6262E+03 
Sauare reot of the fitting error, J(i) = .1037E-05 


Remarks: The first phase picked terms with I(j,12) > 0.50E-03 


and the second phase kept terms with I(j,13) < 0.90 


(continued) 





Table 9 indicates that two of the block-form techniques 
produced excessively ili-conditioned least squares matrices 
and were unable to 2 Selved. Matrix G(i), given by Eq. 
8.31), became singular and this stopped the evaluation. 
The other four biock-form techniques had very high condition 
numbers but were abie to converge on the BVM(3,3) which 
subsumes the system of Eq. eee Each case produced a 
considerabie number of unnecessary model terms and many had 
significant coefficient values (as iarge as .10E-01). It 
would be difficuit to identify these terms as unnecessary 
without having knowledge of the system equation. Both the 


Sacare ree. or the fitting error and the condition number of 
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the least squares matrices corresponding to the models from 
each of these biock form techniques are much larger than the 
exact model case. This leads us to question the value of 
the resulting models. 
mae earch imdicator techmique did mot suffer from these 
problems, and settled on the model equation described below. 
y(n)= 1.0000E8+0 u(n) +0.7999E+0 u(n-1) +0.6000B+0 u(n-2) 
-0.8999B+0 y(n-1) -O.7000E+0 y(n-2) +0.4000E+0 u(n)u(n) 
-0.2000E+0 u(n-1 )u(n-1)y(n-3) -0.1000E+0 y(n-1)y(n-2)y(n-3) 
-0.1200B+0 u(n)y(n-3)y(n-3) -0.2384E-6 u(n-1)u(n-1) 
+0.1490B-6 u(n-1)u(n-1)u(n-1) -0.2962E-6 y(n-2)y(n-2)y(n-2) 
-0.6985E-7 u(n)u(n)y(n-2) -0.5765E-6 u(n-2)y(n-1)y(n-3) 
SO 507 sero Ulmel )u(n-1)u(n-2) -0.1612B-5 u(n-2)uCn-2)u(Cn-2) 
~0.{21968-6 u(n-1)y(n-1)y(n-3) ae 
It is obvious that we can ignore the terms beyond the ninth 
term®@ in Eq. ee |: Table 9 shows that the square root of 
meer ree tang em@ror from the Search Indicator Growth Algorithm 
was better than three orders of magnitude lower than any 
error obtained by the biock form techniques. The condition 
number and fitting error produced by this algorithm are 
Pee iSticaiity ciosee to the values produced by direct 
aenaiyeis of the exact model. The square root of the fitting 
error for the exact model was not exactiy zero, which 
indicates that some numerical roundoff error existed in the 
computer program. We actuaily computed yor as and then took 


tire sqUare root. The non-zero vaiue for the square root of 
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the exact model fitting error J(i) = .4176E-6 translates to 
a p73) of .1744E-12, which is within the expected numerical 
range of zero for the computer. 

Table 10 shows the operation of the Search Indicator 
Growth Aigorithm. Notice how rapidly this technique 
Seeeetea te critical Model terms. The line titled 
"Remarks gives a summary of the heuristic decision making 
rules used for acceptance of the particular candidate model 
tewas in each phase of the algorithn. 

This experiment demonstrated the weakness of the block 
fémae techniques resulting: from their restricted form of 
model growth. It is logical to expect that as one 
arbitrarily adds more and more sets of model terms, the 
probability increases that two or more terms will be nearly 
linearly dependent (colinear). This would result in a large 
increase in the condition number of the least squares matrix 
A(i). This conjecture was also tested by evaluating search 
indicator I(j,13) for all of the terms added at each growth 
iteration by the block form techniques. In all of these 
cases, there were numerous occurrences of I(j,13) vaiues 
Breacer than 0.90, and this appears to explain the observed 
ili-conditioning. Any growth techniques that do not check 
for and somehow handie colinearity among the model terms 
Wiil have simiiar probiems in characterizing systems. tne 
Search Umdicator Growth Algorithm effectively handies this 


problem with search indicator I(j,13). 
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Experiment 3 


This experiment examines the performance of the various 
growth techniques when the system under test actually has a 
Significant delay factor L (previously discussed in Chapter 
VI). The block-form techniques do not have any provision 
rOr recoegniZime this condition during the growth iterations, 
and therefore include unnecesSary model terms. 

We synthesize the following nonlinear system. 

y(n) = 1.0 u€n=$4) +.8 u€n=-5) =.4y(n-1) +.15 u€n-5)y(n-2) {7.3} 
Using the same input sequence (length N = 200) as Experiment 
2, we probe Eq. {7.3} to produce the system output sequence 
{y(n)}. We grow models by the M Directed, D Directed, 
Neighbor, and Search Indicator techniques. The other three 
block-form techniques would require more than the available 
200 measuremengs to evaluate a BVM(2,5) model, and it was 
dé@écided not to include them. 

We started the Search [Indicator Growth Algorithm by 
initially considering the candidate terms in BVM(1,9), the 
highest memory linear model that could be handled by the 
computer program. The largest value of I(j,12) was used to 
Specify Which term to include in the first model. Using the 
degree and memory of this first selected term, we 
WetrTetically consider the candidate set specified by the 
BVM with one increaSe in degree and one increase in memory. 
Tre condition nwamber and error fit for each model are 


evaluated at each iteration, and the resultsS are presented 
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um Bape 11. Table 12 contains the full details of the more 
compact characterization by the Search Indicator Growth 


Algorithn. 












SQUARE ROOT 


TTERATION NEW ShpmeeeD TOTAL CONDITION OF FITTING 
METHOD WAXES i MODES TERMS TERMS Petus NUMBER ERROR; J 


¥ DLRSCTED 1 3BVM(1,1) 3 3 3 ~-1254BE+01 .1266E+01 
2 sva(1,2) 2 2 5 .{813B+01 .1250E+01 
3 3VM(1,3) 2 2 ve ~24118+01 .12445+01 
4 3VM(1,4) 2 2 Q .2820E+01 .38295+00 
5 3vmM(1,5) 2 2 14 -4635B+02 .2274E+00 
6 3ym(1,6) 2 2 13 -2048E+03 .2272E5+00 
(see footnote 9) 6* BVM(2,5) 66 66 seh ~1425B+05 .22938-04 
D DIRECTED 1 Bve(1,!) 3 3 3 .12545+01 .1266E+01 
2 3VM(2,1) 5 6 9 ~7T4628*01 .12445+01 
3 3VM(3,1) 10 10 Le) ~3730E+03 .12248+01 
(see footnote 9) 3* Bvm(2,2) 11 11 20) ~1705E+02 .11945+0!1 
4 3vm(2,3) 15 15 35 .3965B+02 .11245+01 
5 3vVmM(2,4) 19 19 S4 -717258+02 .192958+00 
5 3VM(2,5 23 es Ta ~14248+05 .1974E-05 
NEIGH3OR 1 sye(t,1) 3 3 3 ~1254E+01 .12668+01 
24 pv (2,1) § 4 9 .7462E+*0!1 .1244E+01 
23 avm( 1,2) 3 y 5 -1813E*O1 .1250E+01 
2e B3VM(2,2)* 17 17 20 -1705E+*02 .1194E%0! 
3A ae(3,2) 35 55 55 »1035E+04 .10625+01 
35 BVM(2,3) 15 15 35 ~-3965E+*02 .11242+01 
3C BvM(3,3)* 99 99 119 ~50515+04 .71208+00 
(fsetnote 10) Fs 34(2,4) 19 19 138 -6516E*04 .386055-01 
5 3VM(2,5) 2% 23 161 -2571E*06 .5546E-04 














SEARCH 1 3vu(1,9) 19 1 1 .1000B+01 .5803E+00 
INDICATOR 2 BYM(2,5) 76 2 3 .1802E+02 .2316E+00 
5 3vm(2,5) Ts 2 5 ~3206E+02 .66532-06 








SxaCT ZODSL OF FXE SYSTER 


f= 
f= 
{> 


-1802E+02 .5986E-06 


TABLE ti: Summary Resuics from Experiment 3 


2 


'O Note shat the Neighbor Search sechnique seiected the 3V"(%,3) model over 
TMM others ac iteration 5. Since we oniy have 2090 data soints in our 
measuTrement sequences, ve cannot continue to evaluate the EVM neighbors of 
BVM(2.3). We chose to aodify the neighbor growth algcrithm and consider the 
mew teras specified by the next model we can encompass, 3BVM(2,4). Therefore 
at iteration 4, our overail model contains terms from both 3VM(3,3) and 
3¥M(2,4). In a similar manner we cannot evaiuate 3VM(7,4) because it would 
mmguirr’ @t Least 219 data poirts. We chose to use she same aodificatrion at 
~seration 5 and consider the new terms specified by the next modei we can 
encompass, 3BYM(2,5). Therefore at iteration 5, our averaiz nodel contains 
terms from both 37M(3,7) and 3V¥VM(2,5). Yad we not nade this noditication, 
resuits simiiar <9 experiment number 4 would te obteined. If we had more iata 
polats, w@ wouid Rave b@en a@bie to fotiow tne unmodified zrowth algecrithms and 
wouird have sbta@ined results similiar to those of experiment aumber 2. 
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Experiment: 3 Meeration: 1 Candidate Model: BVM(1,9) 
Number of candidate model terms, q(i) = 19 


Candidate Model Terms After First Phase Reduction: 


# Term eejel.2 ) I(j,13) Related to Term 
1 # u(n-4) - 1409E+01 = = 
Number of terms in final subset (marked with *), Nee 1 


i 
—- 


Total number of terms in resulting model, c(i) 


- 1000820) 


Condition “Wumber of least squares matrix A(i), Ne 
Saumare root of the fitting error, J(i) = .5803E+00 
Remarks: We picked the one candidate model term from 

the candidate set with the highest I(j,12). 
Experiment: 3 beermation: 2 Candidate Model: BVM(2,5) 


Number of candidate model terms, q(i) = 76 


Candidate Model Terms After First Phase Reduction: 


i Term #(9 4,729 TG}, 13) Related to Term 
1# u(n-5) .2278E+00 .7982E+00 y(n-1) 
2 (m1) .1029E+00 .7982E+00 u(n-5) 


Number of terms in final subset (marked with *), > 2 


Total numbér of terms in resulting model, c(i) = 3 
Condition Number of least Squares matrix A(i), NF .1802E+02 
vrdre roam of the fitting error, JCi) = .2316E+00 
Remarks: The first phase picked terms with I(j,12) > 0.70E-01 


and the second phase Kept terms with I(j,13) < 0.90 





Tabet 22 eoeereh Indicator Growth Algorithm results of 


Experiment 3 (continued on next page). 


a 





Experiment: 3 Tteration: 3 Candidate Model: BVM(2,5) 
Number of candidate modei terms, q(i) 

Candidate Model Terms After First Phase Reduction: 

# Term Bier: ) I(j,13) Related to Term 
1* u(n-5)y(n-2) -5363E-01 .8158E+00 y(n-1)y(n-2) 
2* em 1 )y(n-2) .4383E-01 .8158E+00 u(n-5)y(n-2) 
Number of terms in final subset (marked with *), N.= 2 


f 


Total number of terms in resulting model, c(i) = 5 


Condition Number of least squares matrix A(i), Lb .3206E+02 


Square root of the fitting error, J(i) = .6653E-06 
Remarks: The first phase picked terms with I(j,12) > 0.105-01 


and the second phase kept terms with I(j,13) < 0.90 





TABLE 11: (continued) 

This experiment shows that the Search Indicator Growth 
Aigorithm can provide a better conditioned solution (over 2 
orders of magnitude lower) than the other growth techniques 
When the syStém has a Significant delay factor lL. The 
biock-form techniques used in this experiment converged on a 
larger model with reasonably smali fitting error. These 
soiutions however had higher condition numbers and required 
4 significantly iarger number of multiplicative operations. 

The search indicator algorithm converged on the 
following model equation; 

y(n) = 1.0000B+0 u(n-4) +.8000E+0 u(n-5) -.4000E+00 y(n-1) 


+.18975-7 u(n-1)u(n-2) +.1500E+0 u(n-5)y(n-2) {7.4} 


a fr 





We experimentally developed the technique used to 
specify the subsequent sets of candidate model terms based 
Om the terms selected in the first iteration. It is denoted 
as the Candidate Model Specification Technique in the work 
tHat follows. This heuristic technique works well but it is 
acknowledged that there undoubtedly are caseS where it may 
fail to specify a suitably inclusive set of candidate model 
terms. The resulting model may be suboptimal in these 
cases, and other candidate model term specification 
techniques need be considered. 

The major strength of the Search Indicator Growth 
Algorithm is its ability to efficiently select the best 
performing model terms from the candidate set. [tas 
jmaportant to insure that the candidate set is large enough. 
There is no Known way to guarantee ahead of time that this 
goal is met. It remains necesSary for the user of this 
algorithm to recognize when, and if, the candidate set is to 


be expanded. 
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Experiment 4 


The purpose of this fourth experiment is to show that 
even for linear systems, the Search Indicator Growth 
Algorithm can provide more efficient systems 
characterization than the widely used recursive-in-order 
teemniauves like these of Box and Jenkins [Ref. 17]. This is 
a simplified example of what can also happen when block-form 
techniques are uSed on more general nonlinear systems. 

Consider the following linear system equation. 

y(n) = 1.0 u(n) +.5 u(n=-3) +.3 u(n=-8) -.6 y(n-3) -.4 y(n-7) {7.5} 
Using the same input sequence (length N = 200) as Experiment 

2, we probe £q. {7.5} with f{u(n)} to produce the system 

output sequence fy(n)}. We then grow models by the M 

Directed Growth technique (with d=1) and the Search 

ereacetor Growtn Algorithm. Fixing the degree at d=1 

reduces the M Directed technique to an equivalent form of 

the Box and Jenkins technique. LteMiswonylTouswtnat the other 
block form techniques would add many unneeded nonlinear 

terms, and they are therefore not considered here. 

The condition number and error fit for each model are 
evaluated at each iteration, and the results are presented 
Mmounee le Aa. We also include the results of a direct least 
squares model evaluation using the exact form of the system 
aS a comparison basis. Additional details of the 
characterization by the Search Indicator Growth Algorithm 


are presented in Table 14. 
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SQUARES x200T 






ITERATION NEW SELECTED POPAL CONDITION OF FITTING 
METHOD MAME i. =60DEL TERMS ™ORMS TERMS NUMBER ERRORs J 
M DIRECTED 1 3VM(1,1) 3 3 3 -6055E*01 .9201E6+00 
2 3VM(1,2) 2 2 S .1821E+02 .87248+00 
3 BVM(1,3) 2 2 aq 623513402 .49325+00 
(footnote 11) 4 37mM(1,4) 2 2 9 293445+02 .49318+00 
| 5 BYM(1,5) 2 2 11 -1413E+03 .47285+00 
§ 37M(1,5) 2 2 13 ~-1820E+03 .46530E+00 
7 Bve(:1,7) 2 2 15 .25068+03 .1820E+00 
8 3¥M(1,8) 2 2 7 ~1607E+04 .56165-05 


SstaRgC 8 
INDICATOR 


ob 
Ly © 
oh 
—> 


3BYM(1,9) ~1000B+01 .12542+01 
3VM(1,9) 12 2 3 22525201 .53938+90 
3V™(1,9) 1 5 -1378E+*02 .3133865-06 


LA PhO 








SAGE MODSL OF THE STS 2a 5; 5 5 JOS55SeO1 .71065-C5 


fxeus 13 Summary Results from Experinent 4 
Sxveriment: 4 Iteration: 1 Candidate Model: BVM(1,9) 


Number of candidate model terms, q-( i) = 19 
Candidate Modei Terms After First Phase Reduction: 
# Term L Gee e) I(j,13) Related to Term 
1 u(n) 11698 +01 - - 
Number of terms in final subset (marked with *), as 1 
Total number of terms in resulting model, c(i) = 1 
Condition Number of least sauares matrix A(i), N= + 1000E+01 
Saute roet Gf the fitting error, J(i) = .12548+01 
Remarks: We picked the one candidate model term from 
mime Candidate set with the highest I(j,12). 
feel e@ 4: Searen Indicator Growth Aigorithm results of 


Experiment 4 (continued on next page). 


Meteo 


















Exveriment: 4 








Iteration: 2 Candidate Model: BVM(1,9) 


Number of candidate model terms, q(i) = 18 


Candidate Model Terms After First Phase Reduction: 





= Term T(y 312) L(e alo Retaced to. Tern 








y(n-7) .1030E+01 .1288E+00 y(n-3) 







y(n-3) -6925E+00 .1288E+00 y(n-7) 
Number of terms in final subset (marked with *), 


Total number of terms in resulting model, c(i) 


ii 
WN 












Condition Number of least squares matrix A(i), N ~2523E+01 


c 








Squm@remroot of thesefitting error, J(i) * .5393E+00 





Remarks: The first phase picked terms with I(j,12) > 0.60E+00 


and there was no required reduction in phase 2. 





Experiment: 4 ltmweratiiem: 3 Candidate Model: BVM(1,9) 












Number of candidate model terms, q(i) 





Candidate Modei Terms After First Phase Reduction: 








iG.) I(3,13) Related to Term 


Term 











u(n-8) ~1272E+00 .3897E+00 y(n-8) 






y(n-8) .1088E+00 .3897E+00 u(n-8) 





an u(n-3) -7227E-01 .8644E-02 y(n-8) 






Number of terms in final subset (marked with *), 


Total number of terms in resulting model, e(i) = 6 












Condition Number of least squares matrix A(i), Nee - 151 SE ee 






Square root of the fitting error, J(i) * .5186E-06 





Remarks: The first phase picked terms with I(j,12) > 0.50E-01 


and there was no required reduction in phase 2. 


PEGRE 14: (continued ) 
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This experiment shows that the Search Indicator Growth 
Algorithm can converge on an accurate model of a masse. 
system in fewer iterations than the M Directed Growth 
teennique. ie comeditcion number of the least squares matrix 
AC(i) is significantly lower, and therefore the variance of 
the model coefficients 1s lower when the Search Indicator 
technique is used. We also achieved a lower error and found 
that there was little dependency among the final model 
terms. The following model was obtained with the Search 
Tudtseator Growth Algorithm. 

y(n) = 1.0000E+1 u(n) -.6000E+0 y(n-3) -—-.4000E+0 y(n-7) 
-.5000E+0 u(n=-3) +.3000E+0 u(n-8) -.3847E-7 y(n-8) {7.6} 

The main reason these results were obtained, is the 
Dectocubar form of Eq. {7.5}. The M Directed technique 
could not take advantage of the fact that there were 
unnecessary terms ina full BVM(1,8) form, and consequently 
mae tO ineclwae all 17 of the terms. Tables 13 and 14 show 
how the Search Indicator Growth Algorithm efficiently 
converged on an adequate model, based on performance 


evaluation of the set of candidate model pennae 





1) MW iceonrvenvional growth stopping criterion in the 
Piveracire (Hef. i7] is when the fitting error J stops 
decreagéing Significantly. In this example, the M Directed 
growth algorithm (with d=1) could therefore indicate that 
g@rowen Strould Stop at iteration 4; this could result in an 
miicerior model. 
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Experiment 5 


Tmwis etperiment shows how the finite length of the 
measurement sequences prevents the block-form growth 
techniques from converging to an accurate model. Chapter VI 
described how these block-form techniques could not be used 
to evaluate models with more terms than the number of data 
measurements in the sequences. The net effect is that only 
a limited set of model terms can be considered, based on the 
number of available measurements. The Search Indicator 
Growth Algorithm is shown to be unaffected by the size of 
the data sequences, and is capable of considering a nearly 
unlimited number of candidate model terms. hae ability to 
efficiently evaluate avery large set of candidate model 
terms, and cut down to a small and meaningful subset, is one 
of the main strengths of the algorithn. 

We synthesize the following nonlinear system. 

y(n) = 1.0 u(n) +.8 u(n—-1) +.6 u(n-2) +.45 u(n-3) 
-.9 y(n-1) -.7 y(ne-2) =-.25 y(n-3) +.1 u€n)u(n-1)u(n-2) 


-.15 y(n-1)y(n-2)y(n-3) -.35 u(n=2) y(n-3) 


Pmevoneatyy(n—=2) —.18 y(n=2)y(n=2)y(n-3) aa 
We use a random input probe (length N = 100) uniformly 
distributed between the limits of -1 and +1. The system 


output sequence f{y(n)} is produced by probing the system of 
EGey lie itewetn the input sequence {u(n)}. Starting with the 
evaluation of the base model BVM(1,1), we recursively grow 


the model by each of the six block-form growth techniques of 
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Chapter V and the Search Indicator Growth Algorithm of 
Chapter VI. Whenever a growth technique reaches the point 
where insufficient data measurements are available, we stop 
the growth. The Candidate Model Specification Technique is 
used for the search indicator growth, starting with the 
initial model BVM(1,9). 

Me s,condition number and fitting error for each model 
are evaiuated at each iteration, and the results are 
presented in Table 15. We include the results of a direct 
least squares model evaluation using the exact form of the 
system as a comparison basis. Table 16 contains additional 
d@t@eiiss of the more successful characterization by the 


Search Indicator Growth Algorithn. 
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ExXperiment: 5 Iteration: 1 Candidate Model: BVM(1,9) 
Number of candidate model terms, q(i) = 2 


Candidate Model Terms After First Phase Reduction: 


# Term a@ieel.2 ) I(j,13) Related to Term 
1# u(n) -3986E+00 - ~ 

a y(n-2) -2369E-01 ~ - 

Number of terms in final subset (marked with *), Ne 2 

Total number of terms in resulting model, e(i) = 2 

Condition Number of least squares matrix A(i), NAF . 1686E+01 


Square root of the fitting error, J(i) = .2878E+00 
Remarks: These candidate model terms has values of I(j,12) 


that were far greater than those of the other terms. 


TABLE 16: Search Indicator Growth Algorithm results of 


Experiment 5 (continued on next page). 
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Experiment: 5 liemation: 2 Candidate Model: BVM(3,3) 
Number of candidate model terms, q(i) = 117 

Candidate Model Terms After First Phase Reduction: 

i Tern Ee. ike ) Lido) Remiated tomPern 

1* u(n-1)u(n-3)y(n-2) .27148-01 .8895B+00 u(n-1)y(n-2)y(n-3) 
Putin 2 ) y(n=3 ) -2600E-01 .1002E+00 u(n-1)u(n-2)u(n-3) 
3* u(n-1)u(n-2)u(n-3) .25208-01 .8434E6+00 u(n-1)u(n-3)y(n-2) 
4 u(n-1)y(n-2)y(n-3) .2484E-01 .8895E+00 u(n-1)u(n-3)y(n-2) 
5* u(n-2)u(n-2)y(n-1) .2339EB-01 .19278+00 u(n-1)u(n-2)u(n-3) 


Number of terms in final subset (marked with *), N= 4 


ul 
Or 


Total number of terms in resulting model, c(i) 





Condition Number of least squares matrix A(i), Ns 3 1094 E05 


Seuwere root of the fitting error, J(i) = .1667E+00 


Remarks: The first phase picked terms with eats Ue > .208=01 


anid Ghe sSlecond phase kept terms with I(j,13) < 0.85 





TABLE 16: (continued) 
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Experiment: 5 Iteration: 3 Candidate Model: BVM(3,3) 





Number of candidate model terms, q(i) = 113 


Candidate Model Terms After First Phase Reduction: 


Ed Term I(j,12) I(j,13) Related to Term | 
t* y(n-3) .5916E-02 .8253E+00 u(n-3) , 
(2* u(n-3) -4575E-02 .8253E+00 y(n-3) : 


| | 
(3* u(n-1)u(n-1)y(n-3) .4070B-02 .6100B+00 y{n-3) ; 
/4* u(n)u(n)y(n-3) .3733E-02 .5592E+00 y(n-3) : 


Number of terms in final subset (marked with *), N= 4 


‘motal number of terms in resulting model, c(i) = 10 


| 
| 
| | 
[Condition Number of least squares matrix A(i), N = .2329B+03 | 
| 


| Square root of the fitting error, J(i) = .1438E+00 


‘Remarks: The first phase picked terms with I(j,12) > 0.3200E-02 


and the second phase kept terms with I(j,13) < 0.85 





TABLE 16: (continued) 
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Experiment: 5 Iteration: 4 Candidate Model: BVM(3,3) 
Number of candidate model terms, q(i) = 109 


Candidate Model Terms After First Phase Reduction: 


Z Term T(j,12) 15,13) Related to Term 
1* u(n-1)y(n-3) -3760B-02 .8650E+00 u(n-1)u(n-3) 
2 u(n-1)u(n-3) -2548E-02 .8650B5+00 u(n-1)y(n-3) 


Number of terms in final subset (marked with *), a 1 

Total number of terms in resulting model, c(i) = 11 
Condition Number of least squares matrix A(i), N,= .2343E+03 
Square root of the fitting error, J(i) = .12878+00 


Remarks: The first phase picked terms with I(j,12) > 0.2000E-02 


and the second phase kept terms with I(j,13) < 0.85 


TABLE 16: (continued) 
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Number of candidate model terms, q(i) = 


Experiment: 








5 Iteration: 5 


Candidate Model: BVM(3,3) 





Candidate Model Terms After First Phase Reduction: 


# 


| 
| 
| 
: 
| 
| 
: 
| 





p= =F w-1) 

6* u(n-3)y(n-2)y(n-2) 
(7 u(n)u(n)y(n-1 ) 
ce u(n-3)y(n-1)y(n-3) 


9 y(n-1)y(n-3)y(n-3) 


"O* 


Number of terms in final 
“ing number of terms in 
l 


| Condition Number of least squares matrix A(i), N= 


Term 


i wn) wia-3)y(n-1) 
2 u(n)y(n-1 )y(n-3) 
S% 3 u(n)u(n-2)u(an-3) 


4* y(n-2)y(n-2)y(n-3) 


u(n)u(n-2)y(n-3) 


Lipa’) 
~-1194E-02 
- 1081E-02 
~-8264E-03 
~-7172E-03 
-6958E-03 
-6735E-03 
.6681E-03 
.6619E-03 
6922-03 
-6199E-03 
subset (marked with *), Nee 8 


resulting model, c(i) = 19 


Beet of Gite fitting error, 


16: 


J(i) = 


Tew) 


-8799E+00 
-3T99E+00 
-8426E+00 
- 6808E+00 
-6116E+00 
- 6808E+00 
-6116E+00 
- 9303E+00 
-9303E+00 


»-8426E+00 


Related to Term 


u(n)y(n-1 )y(n-3) 
u(n)u(n-3)y(n-1) 
u(n)u(n-2)y(n-3) 
u(n-3 )y(n-2)y(n-2) 
u(n)u(n)y(n-1 ) 

y (n-2)y(n-2)y(n-3) 


y(n-1) 


y (n-1 )y(n-3)y(n-3) 

| 
TIS aGretn) /(a5) 
u(n)u(n-2)u(n-3) | 


| 
| 
| 


. -6544E+03 


- 1006E+00 


The first phase picked terms with I(j,12) > .6000E-03 


and the second phase kept terms with I(j,13) < 0.85 


(continued) 
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Experiment: 5 Iteration: 6 Candidate Model: BVM(3,3) 
Number of candidate model terms, q(i) = 100 


Candidate Modei Terms After First Phase Reduction: 


# Tern Lieaeali2 ) I(j,13) Related to Term 


et eal gs tg a eg nn AN eg a 


me wn)u(a-3) -4679E-03 .4053E-01 u(n-3)y(n-1) 
a wl(n-3)y(n-1) -4498E-03 .4053E-01 u(n)u(n-3) 


ee 0 06u( 3 )a(n-3 ) .42428-03 .3640E8-01 u(n)u(n-3) 


a ae a a eee ee er a 


Number of terms in final subset (marked with *), N.= 3 


Totai number of terms in resulting model, c(i) = 22 


Condition Number of least squares matrix A(i), N .6858E+03 


| 
: 
! 

Square root of the fitting error, J(i) = .8167E-01 | 
| 

Remarks: The first phase picked terms with I(j,12) > 0.33E-03) 


and the second phase kept terms with I(j,13) < 0.85 


TABLE 16: (continued) 








Experiment: 5 Iteration: 7 Candidate Model: BVM(3,3) 
Number of candidate model terms, q(i) = 97 

Candidate Model Terms After First Phase Reduction: 

# Term ejaceiee”) I(j,13) Related to Term 


P™ wXa-1 a(n-1)u(n-1) .12808-03 .8497E+00 u(n-1) 


2* u(n-1)u(n-3)u(n-3) .1189F-03 .5654E+00 u(n-1) 


3* y(n-1 )y(n-2) ~11735-03 .7726E+00 u(n-2)y(n-1) 

#* u(n)u(n-1 )y(n-2) -1002E-03 .6958E-01 u(n-2)y(n-1) 

>" u(n-2)y(n-1) -9235E-04 .7726E+00 y(n-1)y(n-2) 

e* u(n-2)u(n-2) -9088E-04 .3713E-01 u(n-1)u(n-3)u(n-3) 
7* u(n-1) -9021E-04 .8497E+00 u(n-1 )u(n-1 )u(n-1) 
os of Femmes in final subset (matked with *), N.= 7 

potas number of terms in resulting model, c(i) = 29 

ae Number of least squares matrix A(i), i BS) Pom ato) 





Square Root of the fitting error, J(i) = .6570E-01 


| 


Remarks: The first phase picked terms with I(j,12) > 0.90E-04 


| and the second phase kept terms with I(j,13) < 0.85 


TABLE 16: (continued) 
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Experiment: 5 Iteration: 8 Candidate Model: BVM(3,3) 
Number of candidate model terms, q(i) = 90 


Candidate Model Terms After First Phase Reduction: 


# Term ieee) ieipalie)) SRediatediaiom ern 
1* u(n)y(n-1) -1754E-03 .8446E+00 u(n)u(n-1) 

P= ul(n-2) -1292E-03 .1742E-01 u(n)u(n-1) 

oe u(n)u(n-1) -12425-03 .84668+00 u(n)y(n-1) 
Number of terms in final subset (marked with *), Nae 5 


Total number of terms in resulting model, c(i) 


Be 
Condition Number of least squares matrix A(i), N= .13038+04 


Square root of the fitting error, J(i) = .1429E-01 


Remarks: The first phase picked terms with I(j,12) > 0.1058-03 


and the second phase kept terms with I(j,13) < 0.85 





Experiment: 5 Iteration: 9 Candidate Model: BVM(3,3) 
Number of candidate model terms, q(i) = 87 

Candidate Modei Terms After First Phase Reduction: 

# Term ioe hte ) I(j,13) Related to Term 
1™ ¥(n-1 \y(n-2)y(n-3) .10328-04 - - 

Number of terms in final subset (marked with *), N= 1 


Total number of terms in resulting model, Ct) = 35 


Condition Number of least squares matrix A(i), N. 


Square root of the fitting error, J(i) = .6545E-02 


~-1405E+04 


Remarks: The first phase picked terms with I(j,12) > 0.60E-05 


TABLE 16: (continued) 
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Experiment: 5 Iteration: 10 Candidate Model: BVM(3,3) 
Number of candidate model terms, q(i) = 86 


Candidate Model Terms After First Phase Reduction: 


it Tern evade ) I(j,13) Related to Term 
1* u(n)u(n-2)y(n-1 ) Poet O5 8Se4oSk+00 uln)uln=? )u(n-2) 
eee win)u(n-1 )u(n-2) -2892E-05 .7467E+00 u(n)u(n-2)y(n-1) 


Number of terms in final subset (marked with *), Nes 2 


Total number of terms in resulting model, c(i) = 35 


Condition Number of least squares matrix A(i), N.= -1966E+04 
Square root of the fitting error, J(i) = .3882E-05 
Remarks: The first phase picked terms with I(j,12) > 0.15E-06 


and the second phase kept terms with I(j,13) < 0.85 


(MGS RB 


TABLE 16: (continued) 

Teowe 5 showGeerhat the first six growth techniques all 
faiied to converge on an adequate model because they 
exhausted the available data. The Neighbor Growth technique 
came ciosest to generating an adequate model, but it also 
mad tO Pesteret its~model growth choices. 

Ones, qnewocarch Ind#ecator Growth Aigorithm found an 
acceptable model. Note that the condition number of this 
last model (iteration 10) is reasonably ciose to that of the 
exact model of the system, despite the fact that we have 
nearly three times the required number of model terms. 

The Search Indicator Growth Algorithm considered a total 
of 151 different modei terms specified by our selection 


technique, even though there were only 100 data points. 
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This ability enabled it to locate the best model terms to 
accept over the ten growth iterations. There were a number 
of iterations where near colinearity was detected, and Table 
16 shows how the corresponding term with the lowest value of 
I(j.12) was deleted from consideration. This last model 
contains 23 extra model terms, but these unnecessary terms 
are easily identifiable by their low coefficient estimates. 
The equation obtained for this model is; 

y(n) = 1.0000E4+1 u(n) =. 7000E+0 y(n=-2) =-.3500E+0 u(n-2)y(n-=-3) 


~5051E-5 u(n-1)u(n-2)u(n=-3) =-.6385E=-5 u(n-2)y(n-1) 


-.1186E=]$5 u(n-2)u(n-2)y(n-1) +.4500E+0 u(n-3) 

Pees +0 y@me=3) =.1266E-4 u(n-1)u(n-3)y(n-2) 

+, 4830E=]$5 u(n)u(n)y(n=-3) =.1116E-4 u(n-1)u(Cn-1)uCn-1) 
-.9000E+0 y(n=1) +.2841E-4 u(n-1)u(Cn-3)u(n-3) 

—-.5353E=-5 u(n-1)u(n-1)y(n-3) -.1489E=]4 u(n-1)y(n-3) 
-.2538E-4 u(n)u(n-2)u(n-3) -.1800E+0 y(n-2)y(n-2)y(n-3) 
- 4268E-5 u(n)u(n)y(n-1) +.2956E=$4 u(n)u(n-2)y(n-3) 
+.1132E-5 u(n)u(n-3)y(n-1) -.2190E={4 u(n-3)y(n-1)y(n-3) 
+.1779E-5 u(n-3) y(n-2)y(n-2) +.9621E-6 u(n)u(n-3) 

oe 7OPR—-5 u(n=-3)u(n=-3) +.1190E-4 u(n-3)y(n-1) 

+.8000E+0 u(n-1) =-.1500E+0 y(n-1)y(n-2) y(n-3) 

-.2880E-5 u(n-2)u(n-2) +.5000E-1 y(n-1)y(n-2) 

+.8200E-5 u(n)u(n-1) -.9998E-1 u(n)u(n-1)u(n-2) 
-.1520E=]5 u(n)u(n-1)y(n-2) +.6000E+0 u(n-2) 


=,.1210E-4 u(n)y(n-1) +.2016E-4 u(n)u(n-2)y(n-1) ieee: 
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Experiment 6 

This experiment investigates the degraded model growth 
resulting from additive output noise, and the improvement 
that can result when we have control over the input 
sequence, Chapter 6 discussed a method that uses repeated 
application of the identical system input, and calculates a 
point-for-point ensemble average of the corresponding sets 
of system outputs to form an "averaged" system output 
sequence {¥(n)}. Model growth is then attempted using the 
input sequence {u(n)} and this averaged sequence {f(n)}. 

We synthesize the following nonlinear system. 
cree eee OeuCn) + .6 utn=-1) - .4 y(n-1) 

+ .15 u(n-1)y(n-2) + vn) {7.9} 

We generate a random input sequence {u(n);1<=n<=100} 
uniformly distributed between the amplitude limits (-2,2), 
and a random additive noise sequence {v(n);1<2n<=100} 
uniformly distributed between the AMD li cud eo amsts: C=1..1)% 
The sequence {v(n)} is produced with a different random seed 
and is uncorrelated with the input. We produce the noisy 
System output from Eq. {7.9} and grow models by the Search 
Indicator Growth Algorithm. Growth is halted when the 
Plecotme eGrror stops decreasing Significantly, or when the 
conda tion number jumps drastically. These results are 
summarized in the first section of Table 17. 

After reapplying the input probe to the system a number 


of times, an ensemble average of the corresponding system 
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output sequences is performed, and we form the "averaged" 


system output sequence {%(n)}. Various experiments are 


conducted with an increasing number of output sequences 


(ensemble members) used to produce {7(n)}. These results 


@re incituded in Table 17 for 4, 10, 40 and 100 ensembie 


member averages. A direct least squares evaluation of the 


modei, with the exact form of the system and no measurement 


moise, is included for comparison purposes. 


SQUARE ROOT 


DPORAS I On TEM SZUSCTED TOTAL CONDITION OF SITTING 

MSTYOD NAME i MODEL TE2MS TERMS TERMS NUMBER SRROR; J 
SEARCH BYM(1,9) 19 2 2 .1307E+01 .70148+00 
INDICATOR 2 3ym(2,2) 18 2 4 .32738+01 .59508+00 
(1 AVERAGE) 3 pBVM(2,2) 16 2 6 ~1557E+02 .55502+00 
4 BVM(2,2) 14 2 8 ~4703E+02 .53928+00 

5 pvemc2,2) 12 1 9 ~-4740E+02 .5370E+00 

(See Sq. {7.‘0}) 6 pyec2,2) 11 6 15 ~37045+03 .5221E+00 
SSaRCH 1 3¥m(1,9) 19 2 2 -1307E+01 .3784E+00 
IMDICATOR 2 BYM(z,2) 18 2 4 -1745E+02 .26718+00 
(4 AVERAGES) 3 3¥M(2,2) 16 @ | = -1766E+*02 .19552+00 
4 BVM(2,2) 15 { 6 .18465+02 .1832£+00 

5 ewme(2,2) 14 { i; PASSON+O2 «16771200 

(See Eq. {7.11}) 6 ga(2,2) 13 2 e) .36728+03 .14198+00 
SEARCY 1 BYVM(1,9) 19 2 2 .13078+01 .3492E+00 
INDICATOR 2 BVM(2,2) 18 1 3 ~2447E+01 .2285E%00 
(10 S9Satecs ) 3 BYmM(2,2) 17 1 4 -2613E+01 .14486+00 
(See iq. {7.12}) 4 3v“(2,2) 16 4 8 ~-66152+03 .5956E-91 
SEARCH { gvm(1,9) 19 2 2 ~13072+01 .342258+00 
INDICATOR 2 37M(2,2) 18 \ 3 -2443E+01 .21962+00 
(40 AVERAGES 3 3¥M(2,2) 17 1 4 -2618E+00 .1350E+0C0 
(Seb Ba. [7.13}) 4 2vM(2,2) 16 . 7 ~52148+03 .16175-01 
START 1 3V"(1,2 19 2 2 ~-13075+01 .34158+09 
EMDICATOR 2 wvac2,2) 18 ‘ 3 ~244435+01 .21898+00 
(290 sk veRwers) 3 3VM(2,2) 17 1 4 -26198+01 .13468+00 
Sie Zon {[7.14}) 4 paee(2,2) 16 3 a ~5190B+03 .64592-02 

Beact’ (70 Bcise) SySTSK MODEL 4 4 4 259078+02 .1208E-05 


Su@mary gxesults from 


Experiment 6 
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Tis experiment shows how control over the system input 
can be used to reduce the distorting effect of additive 
output noise on system characterization. Both the fitting 
error and the condition number are reduced aS we average 
more data sequences. The Search Indicator Growth Algorithm 
converges quicker and on amore compact model. The 
averaging technique reduces the effect of the output noise 
by a factor equal to the reciprocal of the number of 
ensemble averages used. It is interesting to note that 
pce), tne Sqdware root of the fitting error, dropped by 
approximately the same factor. 

See@etons {7-10}, {7.11}, {7.12}, {7.13} and {7.14} are 
the resulting model equations obtained from the last 
iteration of the growth tests with 1, 4, 10, 40 and 100 
ensemble averages respectively. The actual system equation 
is repeated below for comparison. 

Somer 1.0 Wn) + .8 u(n-1) = .4 y(n-1) 


+ .15 u(n-1)y(n-2) + vn) Tee 8); 


One @ erage: 
y(n) = .10149E+1 u(n) +.45434E+0 u(n-1) =-.18148E+0 u(n-2) 


+,20995E+0 u(n-1)y(n-2) +.14323E+0 u(n)u(n) 
+.19578E-1 y(n-2) -.24651E-1 u(n-2)u(n=-2) 
-,81504E-1 u(n-2) y(n-2) -.96112E-1 u(n)u(n-1) 
mS507928-1 u(n—-1)u(n-1) =-.78196E-1 y(n-1)y(n-2) 
+. 41998E-1 y(n-2)y(n-2) +.67933E-1 u(n)y(n-1) 


Peie2eoeBe 1 utn)y(n—-2) -—.36490E=-1 u(n-1)y(n-1) {7.10} 


Go)3 





Note that the model growth did not select the system term 
y(n-1). The additive output noise is degrading the model 


growth capabilities. 


Four Awerages: 

y(n) = .10179E+1 u(n) +.59065E+0 u(n-1) 
-.94970E-2 u(n-1)u(n-2) -.14650E+0 u(n-2) 
foe e@opoee1 uCn)yutn) —.297T41E=-1 u(n=-2) y(n=2) 
-.18258E+0 y(n-1) +.59640E-1 y(n-2) 


+.16499E+0 u(n-1)y(n-2) re 


Pen, Averages: 
y(n) = .10058E+1 u(n) +.74664E+0 u(n-1) 


mTorr o Ulin—-1)y(n=2) —-.31808E=-1 u(n—2) 
—-.34365E+0 y(n-1) +.90579E=-2 y(n-2) 


+. 16271E-1 u€n)u(n) =-.47212E=-2 y(n-2) y(n-2) fei! 


Forty Averages: 
y(n) = .10003E+1 u(n) +.80620E+0 u(n-1) 


+ .14979E+0 u(n-1)y(n-2) + .76613E-2 u(n-2) 
-.4O0594E+0 y(n-1) -.60805E-2 y(n-2) 


+.13304E-2 y(n-2) y(n-2) Gy are ice: 


One Hundred Averages: 

y(n) = .10001E+1 u(n) +.80304E+0 u(n-1) 
+. 1Y990E+0 u(ne-l)y(n-2) + .35466E-2 u(n-2) 
-.40295E+0 y(n-1) —.26935E-2 y(n-2) 


+.56228E-3 y(n-2)y(n-2) eee is 
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The results presented in Table 17, and in the preceding 
equations, clearly demonstrate the significant improvements 
in model growth and model accuracy that can be obtained when 
me can reduce the effect of additive output noise by 
averaging. This averaging and growth technique is useful 
meremever the statistics of the additive output noise do not 


change during the experiment. 


Experiment ff 


The purpose of this experiment is to demonstrate the 
model growth improvement resulting from use of the two-stage 
"H=R" technique discussed at the end of Chapter VI. This 
technique is applicable to model growth when we have only 
one set of system input and output measurement sequences, 
and the output sequence contains additive noise. We also 
develop alternate criteria for evaluating the fit of a 
model. 

For this experiment, we use the same nonlinear system as 
in Experiment 6. 

y(n) = 1.0 uC€n) + .8 u(n=-1) = .4 yCn-1) 

ee tS utn=—1)y(n—-2) + vn) {7.15} 
The input sequence {u(n);1<=n<=1000} is uniformly 
distributed between the amplitude limits (-2,2), and the 
additive aise sequence {v(n)} is uniformly distributed 
between the amplitude limits (-.2,.2). These two sequences 
are uncorrelated, both with themselves and each other. 


After generating the noisy system output sequence {y(n)} 
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corresponding to Eq. {7.15}, we grow a recursive model using 
the Search Indicator Growth Algorithm and the Candidate 
Modiel Specification Technique. The additive output noise 
degrades the growth but this step is included to show the 
typical results obtained with noisy data sequences. This 
first modeling example is denoted as Test 1 and the results 
are summarized in Table 18. 

Using the VOL model form of Eq. {2.4} and the Search 
Indicator Growth Algorithm, we next grow a nonrecursive, 
nonlinear model from the available meaSurement sequences 
(first phase of the N-R technique). We selected the system 
given by Eq. {7.15} to be of recursive nonlinear BVM form, 
and therefore any finite VOL model produced by our growth 
algorithm can only approximate the performance of the 
system. Since we used the Search Indicator Growth 
Algorithm, the resulting VOL model 1S more compact than any 
block form model using the VOL form. We give extra freedom 
(larger d and m) to the candidate VOL model terms to allow 
for improved growth performance. The results of this second 
modeling example are given as Test 2 in Table 18. 

APeer @vyaluating the coefficients of our final VOL model 
from the preceding growth step, we synthesize it on the 
Compucer and probe it with our stored system input sequence 
{u(n)}. The resulting model output sequence {$(n)} is 
stored, and used with f{u(n)} to grow a recursive model of 


the BVM form using the Search Indicator Growth Algorithm 
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(second phase of the N-R technique). The results of this 
third modeling example are summarized as Test 4% in Table 18. 
We include two direct least squares modeling examples 
using the exact form of the system. First we use fu(n)} and 

fy(n)} to obtain an evaluation of the correct model using 
actual system data. This is summarized as Test 4 in Table 
18. Finally we use {u(n)} and {$(n)} to obtain an 
evaluation of the correct model using the output data from 
the nonrecursive VOL model realization obtained in Test 2. 
This is summarized as Test 5 in Table 18. 

SQUARE ROOT 


POUL  COMDITION |, OF FITTING 
TERMS NUMBER BRROW; J 







MINIMUM 
VALUE OF 
RESIDUAL 
SEQUENCE 
fe(n)} 


EEST DESCRIPTION 


1:  S&GX GRONTH OF A 
RECURSIVS BVM JUSING 
NOISY SYSTEM DATA. 9 pill (sal mies) ©) -.2895 Oe 


2: SIGA GROWTH OF NON- 
RECURSIVE VOL MODEL 
USING NOISY SYSTEM 


DATA. 9 Zee > Sel ah Sie: -. 3048 OOH 


3: SIGA GROWTH OF A 
RECURSIVE BVM USING 
OUTS UD DATA BROM VOL 
HODEL OF TEST 2. 7 A SINS 


4: DIRECT EVALUATION 
OF SXACT MODEL USING 


——- a 
ee ee et ES Se ee ee 


NOISY SYSTEM DATA. 4 19.3 lea -.2971 5 1168 


>: DIRECT EVALUATION 
OF BXACT MODEL USING 
OUTPUT DATA FROM VOL 


MODEL OF TEST 2. 4 20.06 2 ioe -.2961 Beko) AG. 


U 


TABLE 18: Summary Results from Experiment 7 


oer 








MAXIMUM 
VALUE OF 
RESIDUAL 
SEQUENCE 


21147 -.2941 ee 





Equations PT wai. , wes Cee ‘Teteb Taal and 17.20% are the resulting 
model equations obtained from the preceding Test 1, Test 2, Test 3, Test 4 and 
Test 5, respectively. The actual system equation is repeated below for 
comparison. 


y(n) = 1.0 u(n) +.8 u(n-1) -.4 y(n-1) +.15 u(n-1 )y(n-2) + vn) +7. oy 


y(n) = .10059B+1 u(n) +.69807E+O u(n-1) +.10798E-3 u(n-1 )u(n-2) 
+.13629E+0 u(n-1)y(n-2) -.79330E-3 u(n-2) -.29862E+0 y(n-1) 
*.35210E-3 y(n-2) -.13068E-3 y(n-2)y(n-2) 


+.14619B-3 u(n=2)u(n-2) {7.16} 


y(n) = .10064E+1 u(n) +.39759E+O u(n-1) -.16452B+0 u(n-2) 
+.14916E+O u(n-1)u(n-2) +.64402E-3 u(n-2) 
-.66326E-3 u(n-2)u(n-3) -.26126E-3 u(n-1 )u(n-4) 
~.26433E-3 u(n-2)u(n-4) +.28719B-3 u(n-3)u(n-4) 
+.54585B-3 u(n-1)u(n-3) +.85071B-4 u(n-3)u(n-5) 
-.15249E-3 u(n-4) + .12135E-3 u(n-5) 

-.10625EB-3 u(n-4)u(n-6) +.59477E-4 u(n)u(n-8) 
*.63433B-4 u(n-1 )u(n-5) +.56957E-4 u(n-1)u(n-7) 


+.66455E-4 u(n-2)u(n-5) -.61506B-4 u(n-4)u(n-8) iT 
y(n) = .10050E+1 u(n) +.84321E*0 u(n-1t) +.14686E*0 u(n-1 )y(n-2) 
+.33216E-1 u(n-2) -.44315B+0 y(n-1) -.20647E-1 y(n-2) 


+.25836E-3 y(n-1)y(n-2) ee 


y(n) = .10059B+1 u(n) *+.79303B+0 u(n-1) -.39294E+0 y(n-1) 


+.14394B+0 u(n-1)y(n-1) {7.19} 


y(n) = .10047B+1 u(n) +.81040B+0 u(n-1) -.41073E+40 y(n-1) 


+.14214B+0 u(n-1)y(n-2) {7.20} 
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This experiment was designed to show the typical 
modeling improvement resulting from the two-stage "N-R” 
growth algorithm. The nonrecursive VOL model growth phase 
(Test 2) was continued until a fitting error value less than 
the model of Test 1 was obdtained. Table 18 shows that a 
Significantly larger but manageable number of model terms 
are required in this phase. The final VOL modei has a very 
dow condition number. This is due to both the nonrecursive 
nature of the VOL modei form, and the property of the Search 
indicator Growth Algorithm which only picks model terms 
CriteMiag subsiwantial reduction in the fitting error. 

Table 18 shows that the model of Test 3 has fewer terms, 
power Comaltion numoer, and lower fitting error than the 
Medea of Test 1, out not by much. The error of Test 4 is 
Meener than Tests | through 5, but this is primarily due to 
the lower number of final model terms. The error of Test 5 
is the lowest for this number of model terms. 

We know that the modeis of Test 4 and Test 5 should be 
better than the models of Test {! and Test 5, respectively, 
Ponisi S @irtereuit to recognize this from the values in the 
table. The additive output noise causes an offset in the 
DPitvilpmeerrOor ana we cannot use just Chis scalar performance 
criterion J(i) to rate the quality of fit. ‘We instead must 
find some additional characteristics of our obtained model 
fit to demonstrate that we have a meaningful and useful 


growth technique when there is additive output noise. 


1.9 








One possible measure of the quality of fit is ‘the 
amplivude range of the error residual sequence fe(n)} auc ne 
end of each test. A wide spread between the maximum and 
Minimum residuai values would generaily indicate a poor 
model fit. Conversely, a small spread (compared to the 
spread of the system output sequence {y(n)}) would generally 
peepeagse thet we have a good fit. faole 18 inciudes the 
maximum and minimum values of f{e(n)} for the last model 
obtained from each of the five tests. There is some 
G2a@f@réence in’ She spread im each of the five tests, but 
Mocninme S8Pnifican> eMoweh to use as a criterion. The 
Magnivude of the addivive moise masks these performance 
propertiés and we must look for another characteristic. 

Ghapver Vi wlensioned that the residual error sequence 
would be a random sequence -without any identifiable trends 
or patterns if we have adequately modeled the sysvem. This 
Comdivion cam be quatitatively evaluated using a standard 
SEMorecical Geohnique in the literature [Ref. les lie The 
normalized sample autocorrelation plot {r(x); «=0,1,2,...} 


@- 92 Femdom Sequence snould approacn she following forn. 





NORMALIZED 


Pd 

S 1g 

_ 

< 

a A 

ae 

= © 

ce 

@- 

O 2 Sisco ce 

O : 10 

= K 
< 

Wee 25: Typreat auvocorrelation piot of a random sequence 
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Here @ equals the square root of the reciprocal of N, the 


number of data points used in the error minimization 


3 
= = @.032. 


(N =n mt 1) - For example, N = 1000 produces the value 


Following the conventions of Chapters II and IV, the 
following equation is used for the normalized sample 
eurocorrelation of a signal fs(n)} at lag k. 


53 
al a s(n)s(n-k) 
N n=n4 


r(k) 





ro!) {vee it 
A sequence is typically considered to be random if the 
values of {r(k);k=1,2,3,...} lie between toa for at least 
Gee Oo: the normalrzed avitocorrelation plot [Ref. 43]. 

The first seven normalized autocorrelation values of the 
error residual sequences from each of the five previous 
tests have been calculated from the experimental data. The 
autocorrelation values for the random additive noise 
sequence {v(n)} and the random input sequence f{u(n)} have 
been calculated for comparison purposes, and these are 
denoted as Test 6 and Test 7, respectively. These 
autocorrelation values are summarized in Table 19 and 
graphically presented in Figure 25. Peso liteforma t 
Drmesemtcaclion is Wsed in Figure 25 to better distinguish the 


different autocorrelation plots. 
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i; 


TEST DESCRIPTION 


SIGA GROWTH OF 
A’ RECURSIVE BVM 
USING NOISY 
SYSTEM DATA 


SIGA GROWTH OF 
A NONRECURSIVE 
VOL MODEL USING 
wOPST SYSTEX 


DATA 


SIGA GROWTH OF 
A RECURSIVE BVM 
USING OUTPUT DATA 
FROM VOL MODEL 
OF “IEST 2 


DIRECT EVALUATION 
OF EKACT MODEL 
USING NOISY 
SYSTEM DATA 


DIRECT EVALUATION 
OF ‘EXACT MODEL 
USING OUTPUT DATA 
FROM VOL MODEL 
OF “TEST 2 


ADDITIVE NOISE 
SeQUeNee 


INPUT SEQUENCE 


TABLE : 


NORMALIZED AUTOCORRELATION VALUES 


ee?) rics) 


29 ae 


O'S 


Coy f 


» 3736 


O70 


. ons 


OMG 2 


roc) 


-0095 


. 02810 


Ons 


,03u 


.0160 


.0174 


SObcb2 
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.0018 


O20 


7Om2] 


.0106 


«0163 


POMe((0 


2012-56 


r(4) 


OOO 


20075 


.0097 


0243 


.0092 


Se) OSS 


02633 


esp, 


5 Loyal. 


Ore 


~0641 


~O446 


be & 


{0665 


WD 2 


and other random sequences in Experiment 7 


@ Oley) 


Sore 


104 5:1 


105 97 


Oeil 


-0385 


205 le 


.0504 


Autocorrelation values of various error residual 





r(k) 





10 


NORMALIZED AUTOCORRELATION, 
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PeLCUR 25: 
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-_ =. 


RESIOUAL FROM 


, - ' TEST 1 


— ee ee ee ee 


" " RESIDUAL FROM 


610 TEST 4 


NORMALIZED AUTOCORRELATION r(k) 


RESIDUAL FROM 
Test 2 


INPUT SEQUENCE 
fiesta 


NOISE SEQUENCE 
Fisec bie (TEST 6) - 


jE SIOUAL = =AOM ° 


pes s 


=- 1 Qo : _-_ - Some os a 6) aera Gare nerves eien wane) Se 


as me 
Normalized autocorrelation plots for various 


sequences in Uxperiment 7 
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Note the differences between the results of Test 1 and 
Tes; 5 im Table 19 and Figure 25. The residual signal from 
Mest 1 has value r(1) = .2928, indicating that this signal 
ls nonrandom. The residual signal from Test 3 has value 
r(i1) = .0258, and all other autocorrelation values are 
between -24 and +24, indicating that this signal is 
reasonably random. This leads to the conjecture that the 
model of Test 3 is “better” than the model of Test i, 
because the residual sequence from Test 3 is more random. 

This conjecture is further supported by comparing the 
autocorreiation data from Test 4 and Test 5; the exact model 
fit cases. The residual signal from Test 4 has value 
r(i1) = .3756 which indicates that the signal is nonrandom. 
The residual signal from Test 5 is significantly more 
random. The sample autocorrelation data and plots 
corresponding to the additive noise sequence (Test 6), and 
the input sequence (Test 7), provide examples of how the 
autocorrelation values should appear for typical random 
sequences. While some differences can ve recognized in the 
pilots of Fireure 25, We would like to have another criterion 
that could more clearly indicate which sequence is more 
random. 

We developed a measure for the randomness of a sequence 
based on the cumulative distribution of runs of varying 
lengths. <A run of length k is defined as a contiguous 


sequence of k data points with the same sign, bordered by 
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Gaia points with the opposite sign [Ref. 43]. As an 

example, consider the following sequence. 

meemeremaanceid = |-.2,-.5.4+.5.-.1,+.1,+.3,+.05,-.1,--2,--05,-.04} 
items 2 rums of leneih 1, t run of length 2, 1 run of 

imemmeon 5, and 1 run of lemeth 4. 


t 


We define a factor for the "randomness" of a sequence 


{s(n)], to be the percentage of runs with length less than 


or equai a small integer k. 


vy 


Number of runs of length j in the sequence 


C4 
it 
—— 


P(x) = 





Total number of runs in the sequence pee | 


A random sequence should primarily have runs of low 
size. TWere fore Q(t ) Shomid iancre@ase rapidiy for small 
values ofeaxk- Table 20 contains the values of P(x) versus k 
meee tire five error residwal sequences from Test 1 through 
Test 5. We also include the values of Q(x) versus k for 
both the random additive noise sequence {v(n)} and the input 
sequence {u(n)}, as a comparison basis. These are denoted 
as Test 6 and Test 7, respectively. This calculated data is 
Srapnically presemted in Figure 26. A split format 
presentation is used in Figure 26 to better distinguish the 
ite eemenaeawetrroution of runs piots. The abbreviation SIGA 
is used for the Search Indicator Growth Algorithm where 


Space is limited. 


BNE 





1: SIGA GROWTH OF A 
RECURSIVE BVM USING 
ROISY SYSTEM DATA. 


2: SIGA GROWTH OF NON- 
RECURSIVE VOL MODEL 
USING NOISY SYSTEM 
DATA. 


3: SIGA GROWTH OF A 
RECURSIVE BVM USING 
OUTPUT DATA FROM VOL 
MODEL OF TEST 2. 


4: DIRECT EVALUATION 
OF EXACT MODEL USING 
BOISY SYSTEM DATA. 


oo DIRSCT EVALUATION 
OF EXACT MODEL USING 
OUTPUT DATA FROM VOL 
MODEL OF TEST 2. 


6: ADDITIVE NOISE 
SEQUENCE 


i: ISPUT SBQUENCE 


CUMULATIVE DISTRIBUTION OF 


Leor DESCRIPTION pu ) 


~412 


~475 


~474 


TS; 


» 4635 


» 483 


a. 20 


p(2) 


5 oye: 


2711 


Bss0)s 


~715 


oy) 


- 745 


p(3) 


Poot 


aS: 


pos 


=o 6 


7360 


-869 


p(4) 


. 860 


~925 


~904 


2856 


oo 


-928 


~9Fe 


RUNS 


Q(5) 


914 


noo 


-954 


sone 


DS, 


~969 


5S NOS 





Pee 20: Cumulative distribwtion of runs of varying length 


for the error residuals and other sequences in 


Experiment 7 
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Test i: SIGA growth with recursive BVM and noisy system data 
Test 2s SIGH Ereoweh With nonrecursive VOL and noisy system data 
Peco: Oh Gn terowon Wich recursive BYM.and outvout data from 
VOL mo@el of Test 2 
Test 4: Direct evaluation of exact model using noisy system data 
Test 5: Direct evaluation of exact model using output data 
from YOL weodel of Test 2 
Test 6: Additive noise sequence 
Test 7: Input Sequence 
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PICUR® 26: PYoxs of cumulative distribution of runs of 
varying length for the error residuals and 


other sequences in Experiment 7 
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Figure 26 demonstrates the power of the randomness 
factor M(k). The curves corresponding to Test 6 and Test 7 
represent the randomness of the most random sequences in 
this experiment, the additive output noise and the input 
probe. By comparing the cumulative distribution of runs 
curves for different residual sequences, we conjecture that 
thee curve closest to that of Test 6 is the most random 
sequence. Except for pathological cases (e.g. no additive 
noise), it follows that no error residual sequence can be 
more random than our uncorrelated additive output noise. 

The plots of figures like Figure 26 provide an alternate 
means of evaluating the randomness of Sequences. 

Analysis of Figure 26 provides a clear picture of the 
results of Experiment 7. We see that the model of Test 3 is 
Superior to the model of Test 1. Additionally, if we could 
improve our model growth technique and somehow obtain the 
exact form of the model, the N-R technique would provide us 
Wem “the miodel of Test 5. This is significantly superior to 
the model of Test 4, the best we could hope to obtain using 
the existing techniques in the literature. We conclude that 
the N=R technique and the cumulative distribution plots can 
improve our systems characterization methods when we are 


faced with the Case 1A situation. 


om REAL WORMED EXPERIMENTS 
This section presents the results of an experiment using 


real world data Sequences. Unlike the controlled 
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experiments of the previous section, the results are not as 
dramatic. The actual form of the system equation is 
Wanmwowd, as are the specific properties of the input 


sequence and any meaSurement noise. 
Experiment 8 
a ee ae 


he fem London, Connecticut, Laboratory of the Naval 
Underwater Systems Center has been engaged in a eontinuing 
Series of research efforts aimed at accurately modeling a 
particular path in the ocean. One set of experiments 
involved injecting a signal into a transmitting hydrophone, 
and measuring the resulting signal at a distant receiving 
hydrophone. Three sets of these input and output signals 
were sampled, converted to digital format, loaded into 
computer files, and made available for ape ime@nGeenenes We 
denote the different — sequences of length 1024 as 
CHIIN, CH2IN, and CH3IN. The corresponding output sequences 
are denoted as CHIOUT, CH20UT, and CH30UT. 

The sequences were measured over a Suitably short time 
interval, and we therefore consider the acoustic path to be 
time invariant during the period of the measurements. EG ies 
expected tnat ambient noise and Signals from other sources 


are received at the receiving hydrophone. We are 





12 These computer files were made available by Mr. Steve 
Capizzano of NUSC on 12 October 1981. No details were 
available regarding any potential model form or the 
characteristics of the input or noise sequences. 
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therefore faced with a difficult real world example of the 
Case 1A conditions described in Chapter VI. 

We first calculate the sample autocorrelation values for 
the three input sequences. These values are summarized in 


Table 21, and graphically presented in Figure 27. 


NORMALIZED AUTOCORRELATION VALUES 


TERUT SEQUENCE r(1) a es.) r(4) nS) ECG) 


-0313 -.3069 .2131 -.1944 -.2422 .1959 
=e — Os 8636530 =-.1216 -.3309 .1695 


=,0@rrO -.3026 .1913 -.2758 =-.3072 .1630 





TABLE 21: Normalized autocorrelation values for various 


input sequences in Experiment 8 


We also calculate the cumulative distribution of runs 
values for these input sequences. These values are listed 


in Table 22, and graphically presented in Figure 28. 


CUMULATIVE DISTRIBUTION OF RUNS 


INPUT SEQUENCE (1) 2) e863) =— (4) (iy P(6) 





TABLE 22: Cumulative distribution of runs values for 


various input sequences in Experiment 8 
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r(k) 


NORMALIZED AUTOCORRELATION, 


PLOwree 27: 





Mormalized Autocorrelation plots for various input 


signals in Experiment 8 
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CUMULATIVE DISTAIBUTION OF RUNS, 


7. eee 


FIGURE 28: Plot of cumulative distribution of runs for various 


input sequences in Experiment 8 





Twer Gorm of the curves in Figure 27 and Figure 28 
indicate that some problems should be expected. The input 
sequences are not aS random as those of Experiments 1 
through 7. Chapter VI mentioned that this condition may 
give our growth techniques some difficulty. We digress 
momentarily to expand on this point. 

In our computer simulated experimental research of model 
growth, an input sequence {u(n)} based on an independent 
random generator distributed over the amplitude range of 
interest was used. In many real world problems, we are 
given input and output sequences that are simply time-series 
values of the available continuous time signals. Often the 
available input sequence may not be sufficiently random, and 
Significant autocorrelations may result in the error 
residual, even with an adequate model form. Additive output 
noise contributing to the error residual may also result in 
significant autocorrelations. 

Least squares model evaluation does not require any 
assumptions such as independence of the error residual 
values, but our candidate term selection and evaluation 
techniques can give degraded or misleading results in this 
case. The "goodness-of-fit" tests used in Experiment 7 may 
not produce Suitable results in these cases. 

This experiment is continued in an attempt to gain 
hUmponer MMaienc into this common situation, but the 


expectations are limited for a succesSful characterization. 
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Based on the preceding figures, the first 500 data 
Dertes Of CH3SIN and CHSOUT are selected as the input and 
Output sequences for the characterization experiment. We 
first grow a recursive linear ARMA model by the M-Directed 
block form growth technique (with d=1). This is included 
for comparison Stine it is equivalent to the Box and Jenkins 
technique commonly used in the literature. This first 
modeling example is denoted as Test 1 and the resultsS are 
presented in Table 23. 

Chapter VI mentioned that it was possible to evaluate 
models by regression analysis. This involves considering a 
large set of candidate model terms, and evaluating the exact 
PeauecvL1one in the fitting error that results if the best 
performing term is brought into the model, one at a time. 
This is Similar to picking the one candidate model term with 
the largest value of search indicator I(j,8) given by Eq. 
Ponty. “weem@ll from “Tables 5 and 6 that the cost of 
Comeucime I1(3,8) 18S mueh higher than the cost for I(j,12). 
The results of a regression analysis of the experimental 
data using the candidate model term set defined by a 
BVM(1,9) are included for comparison. This is denoted as 
Test 2, and results are presented in Table 24. 

We next use the system data sequences to grow a 
recursive Linear BYM uSing the Search Indicator Growth 
Algorithm. Mos 15 denoted as Test 3, and resultS are 


presented in Table 25. The model from Test 3 performs 
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better at each growth iteration than the model from Test 1. 
Mme réesuits of Test 3 are almost identical to the regression 
analysis results of Test 2 at each growth iteration. peace an 
be shown from the developments in Chapter VI, that the 
Search Indicator Growth Algorithm requires significantly 
fewer computations than regression analysis. 

For Test 4, we use the system data sequences and grow a 
more general recursive nonlinear BVM using the Search 
Indicator Growth Algorithm. This enables us to see if a 
nonlinear model would provide a better fit than the 
beewuoushy analyzed linear ARMA form. The results are 
presented in Table 26. 

Using the VOL model form of Eq. {2.4} and the Search 
Indicator Growth Algorithm, we next grow a nonrecursive 
nonlinear model (first phase of the N=-R technique). We give 
the technique es ae to consider all terms in the VOL(2,9) 
model. This is denoted as Test 5, and results are given in 
Table 27. 

After evaluating the coefficients of the final VOL model 
from Test 5, this model is synthesized on the computer, and 
probed with the stored version of the input sequence f{u(n)}. 
The resulting model output sequence {¥(n)} is stored, and 
used with {u(n)} to grow a recursive model of the BVM form 
using the Search Indicator Growth Algorithm (second phase of 
the N-R technique). This model growth is denoted as Test 6, 


and the results are presented in Table 28. 
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Poe preceding tables and fimures support the previously 
stated concerns regarding the randomness of the input 
sequence {u(n)}. The results of Test 1 through Test 6 are 
all somewhat disappointing. Experience with many controlled 
experiments leads to the conclusion that the correlations 
within the input sequence are the main reason for these 
results in Experiment 8. It should be mentioned, however, 
ae. amother possible contributing problem is that the form 
of the BVM might not be appropriate for the physical system 
we are attempting to represent. 

fie Fesut.ts Obtained in this experiment are the reason 
Mme @aved Ghat systems characterization is a trial and error 
preeess (CCMapter V). The choice of model form and the 
Chargvemeristics of the available (or hopefully controllable) 
input probe are extremely important. These ultimately must 
be selected by the user based on all available quantitative 
and non—-quantitative factors. 

Despite the high fitting error obtained in the various 
tests of Experiment 8, several results are imbedded within 
Tables 23 through 28. The least Squares techniques are 
désvened to Mifnfimize the fitting error nee) while growing 
the model. The performances of Test 1 through Test 4 are 
compared further. Figure 29 is a plot of J(i), the square 
root of the fitting error, versus the total number of model 


terms after each iteration. 
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the total number of model terms after each growth 


iteration, for various tests in Experiment 8 
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ime resmaits of Figure 29 are very Tperiaa ta. The M 
Directed technique (Test 1) reduced the fitting error as the 
order of the ARMA model was increased, but the results are 
Significantly poorer than the regression analysis technique 
(Test 2). The Search Indicator Growth Algorithm using the 
ARMA model form (Test 3) nearly duplicated the performance 
of the regression analysis. We previously showed that the 
Search Indicator Growth Algorithm offered substantial 
computational savings over regression analysis (Chapter VI). 
Figure 29 verifies that even for real world measurement 
sequences, the Search Indicator Growth Algorithm can perform 
systems characterization with results that are equivalent to 
the best existing technique. 

The Search Indicator Growth Algorithm using the BVM(2,9) 
model form (Test 4) provided equivalent or better 
performance than the previous three growth techniques. This 
comparison includes models with the same number of total 
terms. Allowing the algorithm to consider nonlinear terms 
resulted in some of them being chosen over the candidate 
linear terms. 

Even though we have been primarily interested in 
minimizing the fitting error pec. Loe oeinceresting “Co 
look at the maximum and minimum values in the error residual 
sequences resulting from each type of model growth. Figure 
30 graphically presents this information obtained from Table 


23 ubhrovighs Labbe 26. 
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MINIMUM VALUE OF ERMOHR RESIDUAL SEQUENCE 


Plot of the maximum and minimum values of the error 
residual sequence versus the total number of model 
terms after each iteration, for various tests in 


Experiment 8 
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Figure 30 shows that there is a generally decreasing 
trend in the amplitude of both the maximum and minimum 
residual sequences as more model terms are added by the 
growth techniques. Note that the decreasing trend is more 
pronounced in the regression analysis case (Test 2) than the 
Box and Jenkins case (Test 1). The Search Indicator Growth 
Algorithm with ARMA model form (Test 3) performed reasonably 
close to Test 2, and the best performance was obtained when 
the Search Indicator Growth Algorithm was used with the 
BVM(2,9) form (Test 4). The model equation resulting from 
Iteration 8 of Test 4 provided the best combination of low 
Pitving error, good autocorrelation properties of the 
residual, and good cumulative distribution of runs of the 
res®@dual. This 17 term model equation is; 

y(n) = .9345E-1 u(n-5) -.9122E+0 y(n-2) -.8470E+0 y(n-4) 
-.5365E+0 y(n-6) +.1064E+0 y(n-4)y(n-4) 
+.2415E+0 y(n-9)y(n-9) =-.3279E+0 u(n-1) 
=.3585E+0 u(n-5) + .1007E+0 y(n-2)y(n-9) 
-~.3653E+0 u(n-3) =-.1813E+0 y(n=-8) -.1254E+0 y(n-7) 
+.1636E+0 y(n-1)y(n-4) -.2062E+0 y(n-5)y(n-7) 
-.3198E+0 u(n-1)y(n-7) -.3023E+0 u(n=-5)y(n-6) 
+.2965E+0 u(n-6)y(n-1) (eo 

Tames) 27 and 28 indicate that the N=-R technique failed 
to improve the characterization. Several reasons could 
ekaec FOr tnis result. The previously discussed problems 


with the characteristics of the input signal could be a 
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major factor. The normalized autocorrelation values and 
cumulative distribution of runs values in these tables 
indicate that the residuals are probably not random enough. 
ar eacditive ougpwut moise exists im {y(n)}, it may be 
Camrelated with itself or with the input {u(n)}. The only 
cemclusion we can reach on this point is that in this 
experiment, the N-R technique did not offer any advantages 


eo the cheracterization problem. 


ce SUMMARY OF EXPERIMENTAL RESULTS 

The results of these experiments show that block-form 
recursive modeling of nonlinear systems typically produces 
(1) non=parsimonious models (e.g. contain unneeded terms 
with significantly non-zero coefficient values), (2) higher 
SHrcing error J, (3) hi@her condition number for the least 
squares matrix (and therefore larger variance in the 
eoefficient estimates), and (4) distorted coefficient 
estimates on the correct model terms. The block-form 
techniques also require the availability of a larger number 
of data measurements, have a much larger computational cost, 
and oftem fail to converge on an adequate model because of 
excessive ill-conditioning of the least squares matrix or 
the limited amount of available data. 

The block=-form growth techniques make no PEOVisiLOonsS: Lor 
hamid ime near colinearity in the candidate model set. This 
tyeecally resmlts in abnormally high condition number for 


the least squares matrix (and therefore larger variance in 
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the coefficient estimates). Experiment 2 showed the typical 
results of this weakness. 

me bileeck=form techniques force a restrictive set of 
were@s to be fully considered at each growth iteration. This 
typically results in a significant number of unnecessary 
berms in the model equation; inereasing the computational 
eoee amd Contributing to other problems. Experiments 2, 3, 
Sree “4 ame ,examples of this situation. 

The block-form techniques require the availability of a 
larger number of data measurements than the Search Indicator 
Growth Algorithm. Therefore, with limited data, there are 
many cases where we will be unable to grow an adequate model 
hom an @pplica@tion using block-form techniques. Experiment 
5 provided a meaningful example of this situation. 

The Search Indicator Growth Algorithm can better select 
its starting base model by using I(j,12) at iteration 1. In 
this way it can recognize the existence of a delay factor L 
in the system, and can start the growth iteration at the 
appropriate term (Experiment 3). 

Finally, the form of the Search [Indicator Growth 
Algorithm allows for simple extensions that can be used to 
better handle real world conditions like additive output 
noise. The averaging algorithm discussed in Chapter VI 
offers some improvement when conditions permit its uSse 
(Experiment 6). The two-stage "N-R" algorithm proposed in 


Cimeter VI has some capabilities for improving system 
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characterization when we cannot probe the system (Experiment 
7). A real-world example (Experiment 8) shows that the 
Search Indicator Growth Algorithm can characterize a system 
With results that are equivalent to or better than existing 
techmiques. the inability to control the input probe was 
shown to degrade all of these techniques. 

Based on experience with many characterization 
experiments, an important factor appears to be the selected 
amplitude range for the input probe signal. If this range 
is too small, the resultant signals specified by the 
candidate model terms typically are highly colinear. Ehaes 
increases the ill conditioning of the least Squares matrix 
and degrades the performance of the growth algorithm. 

If the input probe is selected to be too large, the 
system output may be unbounded. Bee lsnoe possible or 
meaningful to continue the characterization experiment in 
this case. Experimentation may be necessary to obtain a 
Sumeable input probe. 

The Search Indicator based growth techniques have been 
shown to be superior to block form techniques, but further 
work still remains to be done. The whole systems 


characterization problem is not yet solved. 
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Beli. SRePLACATIONS, CONCLUSIONS, AND AREAS POR URTHER RESEARCH 


ke §6mcDMESUCUSSION OF APPLICATIONS 

A main reason for experimentally modeling a system is to 
better understand the nature of what is actually happening 
in the system. Another reason for modeling is to use models 
i” Gdesiegning controllers or estimators, and for simulating 
systems to predict behavior. The model can serve to confirm 
eeailStime beliefs about functional relationships. A stronger 
concept is that the use of model growing techniques could 
bead to the prediction of a physically significant effect of 
which the application user might be unaware. If new 
information regarding the system can be uncovered, we may be 
better able to understand the inner workings of the system. 

Three current applications for accurate experimentally 
determined models are; (1) fault detection, (2) fault 
evaluation, and (3) reduced-order modeling. We discuss each 
of these in terms of the techniques of this research, and 
describe some new capabilities that appear to be useful. 

Once a model with acceptable performance for a 
particular application has been obtained from the set of 
measurement data, we cannot be sure that there are no other 
equivalently performing models with different sets of model 
térms and coefficient values. Such equivalent models may 


have been uncovered during the model growth iterations. se 
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we somehow obtained equivalent models with different set of 
terms, we must have a means of picking the "best", 

Chapter I mentioned that the criterion for best model is 
application dependent. in wwerms sof performanee modeling’, 
Hoes iollowing criteria appear best. For fault detection 
applications, the optimum criterion is maximum sensitivity 
of the error residual to changes in each system parameter. 
Por “alt evaluation applications, the optimum criterion is 
Maximum distinguishability of the system characteristic that 
has changed. For reduced-order modeling applications, the 
optimum criterion is the best performance of a finite term 
model in duplicating the behavior of the system. 

One general criterion that appears to be a good 
compromise is based on Ockham's Razor, "... the simplest 
model is the best ...". We define the simplest BVM that 
adequately represents the system performance as the one with 
the smallest number of terms, and the lowest degree and 
memory (when the number of terms are the same). This 
Cmieper hones probably not optimum for all applications. 

Assuming that we have reduced our set of equivalent 
models to one "best" model, there still are two main 
concerns. First, we would like to know if any simpler 
equivalently performing model exists. It is conceivable 
that particular systems might be better modeled by a 
Guaperens funetional form than BVM, but we will not consider 


this case. If a model is obtained by one of the block-form 
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techniques of Chapter V, the experiments of Chapter VII 
demonstrate that the model may contain a large number of 
extra terms. If the models were obtained by the Search 
Indicator Growth Algorithm, unneeded model terms may have 
been included at various iterations. We want the most 
parsimonious (minimum number of term) model that matches the 
performance of the system within acceptable error. 

The second concern is how to efficiently use the model 
in applications . If the model is simulated and used as 
Shown in Figure 1 of Chapter I, a running average of the 
Squared difference between the system and model output can 
be monitored. When this average exceeds a threshold, a 
fault may have occurred in the system. Other factors that 
might also cause this condition include; (1) increased 
additive measurement noise, and (2) the current input signal 
exceeding the amplitude range used in this model's growth. 

Once a fault has been detected, the coefficient values 
can be re-estimated and used as an indication of the 
Powmsiblé kind of fault. This last step is the basis of the 
fault evaluation application. The work that follows is 
designed to improve the efficiency and accuracy of the 
Moproaen to both of these modeling concerns. 

A possible concept is to select the subset of model 
terms whose coefficients are most robust to variations in 
conditions that are unrelated to system faults. These 
coefficients are designated as the "syndromes" of the model, 


and the following method is proposed for their selection. 


fel ie 





Step 1: Given the best er ee Pitede model, 
estimate the coefficient values p(i) that minimize 
ter error criterion T°3) for different random input 
probes, each with approximately the same amplitude 
distribution. Mark those coefficients whose 
estimates remain nearly constant from test to test 
as Wavime the property "A". 
Step 2: Using the same model, estimate the 
eoerimictent values p(i) that minimize the error 
CrPuer ion yeas for different ranges of input 
amplitude (continue to use a uniform amplitude 
daeerrnue1on). Limit this range of input amplitude 
to the known or assumed range of the actual 
operating input of the system. Mark those 
coefficients whose estimates remain nearly eae 
Prom test to test as having the property "B"., 
Coemrtctlents with both the "A" and “B" property are 
Comepectcured Co be robust to variations in both the input 
probe level and the particular probe contents. These 
syndromes should therefore be most sensitive to changes in 
the system under test. Ue is vogical to expect that the 
final model obtained by the performance modeling based 
Search Indicator Growth Algorithm of Chapter VI, would 
provide a Superior fault detection signal threshold, and 
better fault evaluation syndromes, than the models obtained 


Dy" tive DwoekK—-form growth algorithms of Chapter V. 
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All of the fault detection and evaluation methods in two 
welal retenenced survey reports {Ref. 44 and 45], and other 
recent papers [Ref. 46, 47 and 48], are based on using the 
Full set of coefficients of the obtained model form. The 
preceding development suggests that the set of syndromes 
would provide a clearer reference for recognizing system 
faults than would the full set of model coefficients. It is 
also expected that these Search Indicator Growth Algorithm 
developed syndromes would be superior to the large set of 
Nonlinear ARMA lattice coefficients proposed for this 
application by Reference 5. 

The reduced-order modeling problem has received 
considerable attention in the literature (Ref. 49 and 50]. 
Tye “concept is to determine the particular finite size 
(number of terms) model that best matches the performance of 
some system. The existing techniques attempt to fit this 
problem to a particular parameter estimation form of 
solution (e.g. recursive-in-order model form and the use of 
bewineon's algorithm). The results of Chapter III and 
Appendix 8B indicate that these methods would generally lead 
to suboptimal models unless the restrictive assumptions are 
met. It makes more sense to grow a model uSing a general 
performance modeling technique like the Search Indicator 
Growth Algorithm, rather than disguising the problem in a 
parameter estimation form. Its also easier to optimize 


the model performance while limiting the number of model 
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terms. The experiments in Chapter VII contain many examples 
showing where the Search Indicator Growth Algorithm produced 
equal size models with vastly superior performance compared 


With recursive-in-order techniques. 


B.  COMCLUSIONS 

[The purpose of this research was to extend existing 
techniques for experimentally developing discrete-time model 
equations to represent the input-output behavior of linear 
and nonlinear systems. 

We started by dividing the problem into four key parts; 
bae Tunmevtional formnm of the model, choice of error 
Minimization method, efficiency of model selection and 
evaluation, and verification of the quality of the model for 
various current applications. After a discussion of 
existing discrete-time model forms, we adopted the more 
general Bivariate Volterra Model (BVM). Various error 
minimization methods were examined, and it was shown how the 
Covariance least squares method 18 generally significantly 
superior to the Autocorrelation method typically used in the 
literature. We next developed expressions for the 
distortions in both the fitting error and the coefficient 
estimates of a linear recursive model form when there exists 
additive output noise. These results clearly show the 
effects of the magnitude of the recursive coefficients and 
the sample autocorrelation values of the noise sequence. 

A general set of recursive solution and evaluation 


equations was developed for computational savings and as a 
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unifying basis for the work that followed. This enabled us 
to easily evaluate a wide range of model equations without 
limiting the form of the model or making other unnecessarily 
restricting assumptions. Existing model growth techniques 
were examined and extended to allow consideration of the 
more general BVM form. Inherent limitations (e.g. maximum 
number of model terms less than or equal to the number of 
available data measurements) were recognized. 

A major goal was the development of a growth technique 
that could perform better than existing techniques. The 
concept of Search Indicators was introduced, and led to the 
development of the Search Indicator Growth Algorithm and 
related special techniques. The physical interpretation and 
Significant computational savings resulting from the use of 
this algorithm were presented, along with provisions for 
handling the important problem of colinearity among the 
model terms. Various conditions affecting the evaluation 
and growth of models were examined, and several special 
techniques were proposed. 

The remainder of the thesis focused on experimental 
verification of the strengths and weaknesses of the various 
model growth techniques. These results clearly showed the 
improved performance of the Search Indicator Growth 
Algorithm. Some specific ideas for improving the 
development and use of mathematical models in several 


current applications were also presented. 
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C. AREAS FOR FURTHER RESEARCH 


Various interesting questions were encountered during 


the development of this thesis. Those listed below can form 


the basis for extension of this work and are recommended as 


areas for further research. 


es 


La) 


The BVM form was emphasized in this research but the 
modeling techniques are not limited to terms that 
only contain integer powers, and products of powers, 
of past and present input values and past output 
values. The model form could be extended to include 
decaying exponentials, divide functions, and other 
factors of the measurements. These could be 
emplicitly included, or we could approximate factors 
like exponentials with difference equations. 

The Candidate Model Specification Technique is a 
fivese Heuristic approach for specifying the 
candidate terms to use in the Search Indicator 
Growth Algorithm. Techniques for decoding the 
patterns in the residuals might lead to improved 
methods. 

The development of the key search indicator I(j,12) 
included the definition of a matrix H(i-1) given by 
Eq. {6.25}. This matrix has several interesting 
properties (e.g. positive semi-definite and 
idempotent), and it might be possible to exploit 


these to produce improved search indicators. 
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The Search Indicator Growth Algorithm performance is 
dependent on the choice of the candidate model terms 
at each iteration, and the values of the two 
heuristic variables hy and ho- These values could 
either be kept fixed throughout the modeling 
iterations, or possibly be made adaptive. 

The Search Indicator Growth Algorithm has been shown 
to be useful for bringing terms into the model 
equation. Since the candidate set is allowed to 
expand after each growth iteration, it follows that 
unneeded terms might exist in the model. Lt might 
be possible to develop efficient search indicators 
that operate on the existing set of model terms and 
Suggest which should be eliminated at each 
rPeeracion. Existing, but expensive, techniques 
include backward regression [Ref. 40]. 

Chapter VI made a case for uniform amplitude 
distribution of the controllable input probe 
sequence, but mentioned that other distributions 
Goud Hrovide better results in certain cases. For 
example, a distribution that put emphasis on higher 
amplitude values might be better suited for 
Pecorgnizing Specific strong nonlinearities in the 
system. 

Additional modeling experience needs to be gained 


With real world experiments. 
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APPENDIX A. GENERAL RECURSIVE SOLUTION 


OF A GROWING SET OF NORMAL EQUATIONS 


The solution of a large set of normal equations occurs 


Pm Yarious frelds, including systems identification, 
prediction, and least squares estimation. 


of normai equations must be solved, 


Linear 
Successive sets 


where the preceding set 


can be related to the subsequent set in the partitioned form 


described below. 


Set 1: A(1)p(1) = 
Set 2: A(2)p(2) = 
where 
At ) 
Ae) oe 
B(2/1) 
and 


Set i: Ges ) = 


where 


ACi-1) 
ie. oe i=” 


h(1) 


n(2) 


' BC 2/1) 


| A(2/1) 


1 n(2/1)? | 


n(i) 


aC / tet) 


Py as ey 


B(i/i-1) |} a(a/i-t) 


and 


| 
“i [n(a-1) pai | 


Fach set is of size c(i), where c(i-1) < c(i). 


CLOTS 


eee 


at 


fa.5} 


{A.6} 


Ne we 





Rather than solving each set independently, the general 
partitioned structure of these related sets of equations can 
be exploited to obtain the solution to each set in an 
efficient recursive manner. From Eq. {4.22}, we use the 
Momation q(i) = c(i) - c(i-1). 

Given a set of c(i) linear normal equation; 


ACi)p(i) 


h(i) {a.8} 
Where we have a unique solution to the previous set of 
e(i-1) normal <a 

p(i-t) = a(i- bh h(i-1) {aA.9} 
Substituting {A.6} and {A.7} into {A.8} yields; 
ACi-1) ;B(1/i-1) h(i-1) 
B(i/4-) 'a(a/4-1) h(i/i-1) {a.10} 
The partitioned matrix inversion theorem [Ref. Ine eke 18 | 
permits exploitation of the symmetry of matrix A(i) 
whenever fa(i/i-1)] a 0. 
Pecite + (3)0(4) PRC)" Er(a)e (ay n(i-1) | 


am awe se eo eee ee S| ee ee ee ee ee Se ee ee 


g(a) F(a)? 


ei) eres Pee a ae} 


mmerme FG) ia a c(i-1) x q(i) matrix, and G(i) is a 


a(i) x q(i) matrix each defined below. 


Bi). @> we) halcu ee iA te 
Cli) = #(i/i-1) - B(i/i-t) A(a-1) > B(i/i-1) 
= A($/i-1) + B(i/i-1)) F(i) {a.13} 


Zxpanding &q. (Ai | produces; 
CAT MG) + (2) | PU ant)» BOC) att) 


(i) F(i) h(i-1) ae (ca) = re se) 
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Define two vectors of size q(i); 
mee = = hi i-1) + (i) n(i-1) {A.15} 
k(i) = aay” ea {aA.16} 
Substituting {[A.15} and {4.16} into {a.14} 
mii-1) + F(i)e(i) 2(1) 


G(i)7) g(i) 


mist) + (i)k(1) 


Gt) ea 
p(i-1) F(i) 
2 ir ‘ai epee (i ) 
0 I = eer 


Squation fa.18} is our desired answer and is presented 
oo Chapter IY as Eq. (AS: 

A compact recursive expression for the resulting minimum 
average sum squared error, ele can also be developed. 
Substituting | died 1mGo {4.3} for ne Cee model produces; 


A (4) 


a = SCD RGOT BCH) 


= 1 yly - r(i)9(i) fa.19] 


N 
Usamue the definitions of ka. ee) through 2 be the 
vectors r(i) and 9(i) can be rearranged in the form of 


¥ec vers biG) and ee) respectively. Substituting these 


ct 
ty 


GO Gk fa.19} produces; 
Z 


3°(4) = 4 yTy - nla)’ pCi) f4.20} 
N 
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Substituting {A.7} and {A.18} into {A.20} and simplifying 
yields; 


Ay 
Pes igen Pati) Y acayety) T 303) 


a 
N 


= 1 yTy - n(i-t) p(i-t) - n(a-1) F(i)(4) - hy =) ce 
N 


Tce) — | ee) FO) * ul Pea) ] k(i) 


it 


feet) = welt) Ki) fa.21} 


where we made use of {A.15}, {A.16}, and ioe) the 
st 
previous evaluation of the (i-1) model. This last 


expression appears in Chapter IV as Kq. {4.36}. 
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APPENDIX B. RELATIONSHIP OF THE GENERAL RECURSIVE ALGORITHM 


TO LEVINSON'S ALGORITHM 


When we are given a set of simultaneous linear 
equations, 
Riee@ipti} = ni) slo | 
where ACi) is a c(i) by c(i) real matrix, the direct 
; 3 
solution of Eq. {B.1} requires on the order of [c(i)] /3 
multiplicative operations. When A(i) is symmetric and 
positive definite, the solution can be accomplished with the 
3 
order of [c(i)] /6 multiplicative operations by various 
techniques (e.g. Cholesky, LU decomposition, etc.) . 
Appendix A developed a general recursive solution for 
p(i), based on the concept that the previous set of 
e(i-1) < eli) equations given by; 
ACi-1)p(i-1) = hn(i-1) {a.9} 
#1 
has been previously evaluated for p(i) and A(i-1) , where 


the following partitioned matrx relationship exist. 


ioe a)" Sone Cl UE 
wei/i-1) | a(a/t-1) {A.6} 


T 
[ncs-1)" cen)" Ev 1) 


This general soiution equation for ye) is repeated below; 


and 


n(i)” 


|=” 


(2-1) F(i) 
pi —-~— + peeeceanm | k(t ) {a.18} 
O I 


whtere O is the nwii vector, Pens tthe aden tity matrix, and; 
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P(i) = re) a B(i/i-1) {A.12} 
Gey = A(#/i-1) - EVE MENC=1) BCisfz-1) 

= A(Ci/i-1) + Bey ter (i) eae, 
ye) Ct h(i/i-1) - B(i/i-1) p(i-1) 

. h(i/i-1) + F(i) h(i-1) {a.15} 
Mei) = c(i) 3) {a.16} 


The matrix F(i) given by Eq. Veet 2: requires the use of 
the inverse of matrix A(i-1). If this was not explicitly 
soived for previously, it is needed at this point. The 
vector BU) Ziven by Eq. {A.16} requires the use of the 
inverse of matrix G(i). 

In 1947, Norman Levinson pubiished a paper in which he 
"@n order to facilitate computational Bee duis worked out 
an approximate, and one might say, mathematicaily trivial 
procedure” [Ref. 1; pp- 161]. Levinson was working in 
Comjunction With Aorbert Wiener on a probiem involving 
linear moving average filter design using a least squares 
fit. This required solving a set of simultaneous linear 
Sam@ations of the form of Eq. Dee Levinson developed an 
iterative procedure for obtaining p(i) based on the 
mol lomwe me conditions (given in terms of our notation). 

(1) Matrix A(i) is symmetric, and can be 
represented in the form of kq. {A.6}. Bet 
(2) The solution for p(i-1) has been previously 
obtained. {B.3} 


(3) a(i/i-1) is a 1x 1 matrix (scalar), a(i/i-1) {B.4] 
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Later, Durbin [Ref. 2] simplified the solution for p(i) by 
adding a fourth condition. 
(4) BCi/i-1) reduces to a vector b(i/i-1) which 
equals the vector h(i-i) in reverse order. {B.5} 
Despite this separate and later simplification by Durbin, 
the procedure is commonly referred to as Levinson's 


algorithm, and it can be shown that; 


p(i-1) ey, 
pli) = Je----- : mem k(i) {B.6} 
O 1 


The main property of this algorithm is that calculation of 
the vector f(i) and the scalar k(i) do not require any 
Matrix inversions. Additionally, the number of 
multiplicative operations required for the solution of kq. 
{p.6} is 2c(i-1)+*1, where c(i)-1 = ec(i-1), and e(i-1) is the 
size of matrix A(i-1!). The popularity of Levinson's 
Sel@orithm@ is a result of this very small computational cost, 
which eer, B s@eHificant savings over the direct matrix 
sommutiom of Eq. tery by the more conventional techniques. 

We demonstrate that {p.6} is a special case of fa.18} 
Degied om the four conditions (Beet. {B.3}, {p.4}, and {p.5} 
presented above. We first develop an expression for the 
factor Ci ) ine oo {g.6}. From {3.5} we can write 

BC i/i-1) = h(i-1) in reverse order = Tie) ou 


where the "e,’ denotes an end-for-end reversal. 
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Substituting Bie! inte ea | Pred uc e's; 


F(i) 


= rules BC se) 


= eg) + eet =4)) aoe 


From the symmetric condition { Beet Eq. {a.9} can bemwritten 


A(i-1) B(i-1) = ‘h(i-1) {B.9} 


Substituting {B.9} into {3.8} and simplifying yields; 


F(i) = -A(ae1) a t=) 


S CRS )p(i-1) 


x ~p(i-1) Baton 


Fron {B.10}, we see that F(i) is obtained directly from the 


previous solution p(i-1) without any computational cost. 


We now develop an equation for k(i) in Eq. {B.6}. 


substitute Bed}, Pcie Oe and {B.10} into ues: and solve 


tor tne 


@(i) 


ie) Were €(1). 
ee Mhi/i-1) + B(i/i-t) F(3) 


= ge(i/i-1) - Wat)? Bi-1) (Beale 


The solution of Eq. {B.11} requires c(i-1) multiplications. 


suboetituting {B.10} imtoo ih o5) and simplifying yields; 


g(i) 


h(i/2-1).* F(i) n(i-1) 


h(i/i-1) - Fae Bate} 


The solution of Eq. {B.12} requires c(i-1) multiplications. 


Next 


ge 


substitute {3.11} and {3.12} into {A.16}. 


ae) aes) 

ed ah Ad =u wt Tw 
[ a(ifi-1) - n(i-1) $(i-1) J] [ n(a/i-t) - B(i-1) n(i-1) | 
Ri), a s@alar VBE} 
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Therefore k(i) requires 2c(i-1)+1 multiplications, and does 
not involve any matrix inversion, other than the trivial 
Seater inversion of the 1 x 1 matrix, G(i). 

We see that under the four stated conditions, the 
general recursive solution algorithm {a.10} reduces to 
Levinson's algorithm, and has the same COmMpuGatronwas cos t. 

The simplification of Levinson's algorithm is critically 
dependent upon conditions {B.4} and {B.5}. Ties ot irs ¢t 
condition, that A(i/i-1) is a 1x 1 matrix, restricts 
Levinson's algorithm to be a single-step iterative technique 
(one increase in size of matrix A(i) over A(i-1)). This 
limits model growth to only one new term at a time in the 
general modeling problem. The second critical condition, 
that the transpose of B(i/i-1) equals h(i-1) a reverse 
order, restricts Levinson's algorithm to both the limited 
cases of model growth that adds terms that are delayed 
versions of existing terms, and the use of the 
Autocorrelation error minimization method that produces 
le@st “squares normal equations with this special structure. 
Bote thet ’@etriz A(j) for j=1,2,.--,i has to be Toeplitz, 
and the vector h(j) must satisfy condition [{B.5}. 

Multiple channel versions of Levinson'’s Algorithm have 
been proposed [Ref. = 9], but these all require the 
special “recursive-in-order’ relationship between the model 
terms represented in A(i-1), and those terms represented in 
A(i). This is an unnecessary and suboptimal restriction for 


the general model growth problem. 
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APPENDIX C. DETAILS OF ORDER OF COMPLEXITY CALCULATIONS 


USED TO COMPARE THE GROWTH TECHNIQUES 


Using the same convention as Chapter VI, we denote the 
size c(i-1) of the Cb aa model as P, and the number of 
data points in the error minimization as N. When 
distinction is needed, we use the shorthand "Step Di", "Step 
Bi", or "Step Si" to indicate that we are refering to Step 1 
oe them direct least squares technique, block-form recursive 
teehmique, or search indicator growth technique, 
respectively. When the computational cost is the same for 
all three techniques (e.g. steps 1 through 7), we use just 
the step number. 

The computational cost for each of the first seven steps 
is developed as follows. Sach of the three growth 
techniques uses the identical first seven steps. 

Step 1: Set i = 1, and form the term vector x(n,i). No cost. 
Seep 2: Fomme (1) weing Eq. {4.5} 


R(4) = 4 %(i)'x(4) [4.5] 


Since X(i) is a N x c(i) matrix, each element in the 
c(i) x c(i) matrix R(i) requires N multiplications 
and 1 division operation. Because of symmetry, 
there are c(i)l[ec(i)+1]/2 elements in matrix R(i) 
that must be calculated. Therefore the computational 


eee is [+1 |c(i)[ c(i)+1]/2. 
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sep 3: Form Ei) using Kq. a. 6 } 


ea) ani (i) y {4.6} 


4 


Each of the c(i) elements in vector r(i) requires N 


multiplications and 1 division, therefore the 


computational cost is ec(i)[{N+1 ]. 


Step 4: Invert R(1i) 


Since R(i) is a symmetric matrix of size c(i), it 


can be inverted at a cost of [c(i)**3]/6 operations. 


Step 5: Solve for y*(4) using Eq. ees 


Z 
Gy) a ay 


~ 


. eGo aie aes 4.3} 


eeeces se 13 8 size NW vector, the first term on the 


Tagpt side of {4.3} can be computed with Ntt 


operations. 


Using the preceding definitions for the 


sizes of vector r(i) and matrix R(i), the second 


term on the right side of {4.3} requires 


[e(i)**2] + c(i) operations. The total cost for 


this “Step is 

mp 6 «amd Step 7 do 
Adding costs results 
mor the direct Least 
Cee = [ei era /6 + 
We continue with 


form tcectinique. 


therefore [c(i)**2] + c(i) +N +1. 
not involve any computational cost. 
in the following complexity equation 


squares technique. 
ean | /2 4 c(i)(3N+5]/2 + N+i (0.9) 


the computational cost of the biock- 
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step B8: Form A(i/i-1) using Eq. {4.27} 
Wig 1-1 eee. W(i/i-1)) w(a/i-t) Peery 
N 


The N x q(i) matrix W(i/i-1) is the data matrix for 
the new model terms. Each element in matrix 
A(i/i-1) requires N multiplications and 1 division 
Operation. Because of symmetry, there are 
a(i)lq(i)+1]/2 elements that must be calculated, 
therefore the computational cost is 
[wet Ja(i)[a(a)+t ]/2. 

Step B9: Form B(i/i-1) using Eq. | desig. 


Bie 2= 1) oe wi-1)  WO4/i-1) Ae BB | 
N 


The N x P matrix W(i-1) is the data matrix for the 
Gc model obtained previously. Each element in 
matrix B(i/i-1) requires N multiplications and 1 
division operation. Since there are q(i)P elements 
that must be calculated, the computational cost is 
[N+1]q(i)P. 

tep B10: Form h(i/i-1) using Eq. {4.29} 


ma / i= i) 1 W(i/i-1) y {4.29} 
N 


Using che preceding definitions of the sizes of 
matrix W(i/i-1) and vector y, each of the q(i) 
elements in vector h(i/i-1) requires N 
Multiplications and 1 division operation. Therefore 


the computational cost is a(i){net]. 
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temp Bii: Form F(i) using Eq. {4.30} 

F(i) = i Ba /ia4)) {4.30] 
The P x P matrix aed) is the inverse of the 
least squares matrix for the Cape model obtained 
Pee@eousiy. Since ipeltrix B(i/i-1) is P x q(i), each 
element in matrix F(i) requires P multiplications. 
There are q(i)P elements in matrix F(i) and 
therefore the computational cost is q(i)[Pp**2]. 

tep B12: Form G(i) using Eq. {4.31} 

ny Cy a= ) + B(i/i-1) F(i) (oo, 
Using the preceding definitions of the sizes for 
matrices B(i/i-1) and F(i), each of the [q(i)**2] 
elements in the result of the second term on the 
rig“latwside of {4.31} Peaguires Pemultiplicatkons. 
Since there are [q(i)**2 | elements in this resulting 
Piast, phe total computational cost is P[q(i)**2]. 

Step Bi3: Form g(i) using Eq. {4.32} 

Seana eer Gay n=) [4.32] 
Using the preceding definitions of the sizes for 
matrix F(i) and vector h(i/i-1), each of the q(i) 
elements in the result of the second term on the 
right side of {4.32} requires P multiplications. 
Therefore, the total computational cost is Pau). 

tep Bia: Invert G(i) 
Since G(i) is a symmetric matrix of size q(i), it 


can be inverted at a cost of [q(i)**3 ]/6 operations. 


Zo 





Step B15: Form k(i) using Eq. {4.33} 
Gi) (1) 2(1) TR in 
Since the inverse matrix e(if? 1 Soraaizemag( i) xi qi), 
and g(i) is a q(i) size vector, each of the q(i) 
elements of vector k(i) requires q(i) 
Multiplications. Therefore, the computational cost 
is [q(i)**2]. 
Sep BN6e Sole for Ter) using Eq. {4.36} 
7 Co ee - g(i) k(i) {4.36} 
Since g(i) and k(i) are both size q(i) column 
vectors, the computational cost is q(i). 
Step Bi7 does not involve any computational cost. 

Based on the results of step Bi7, the growth may stop. 
Adding complexity notational for each step, results in the 
Ffoitlowing complexity equation for the block-form technique 
(etepe 388 threweh B17). 

On) = [eras] /o + [p+m+3)(q(i)ere)/2 
+ q(i)[wP+[P**2]+2P+(3n+5]/2 | os 2) 
if a@@ditional growth iterations are required for 
adequate modeling verformance, one additional computational 
step is required before starting again at step B77. 
Step B18: Form inverse of A(i) using Eq. AST 


: 7 
-1 on en a CD) aan), | B(i)a(i) 
Ai) = ae aaeecocee =e 


=|) 1 
G(i) F(i) G(i) {4.37} 
All of the indicated matrices in {4.37} have already 


been eca@iculated. The only computations required are 


Zoe 





those necessary to form the matrix factors 

= T -] 
FCi)G(i) FCi) and F(i)G(i) . Using the 
previously defined sizes of these matrices, these 
Piewors cam boOch be calculated “With a total cost of 
a(i){p**2] + Plq(i)**2]. 

We continue with the computational costs of the search 
indicator growth technique. The cost of step 1 through step 
7 is the same as first two growth techniques. 
eee SS: Pormel(j,i2) for terms in w(n,i/i-i) using 

the definition of Eq. foreol. repeated here; 
Z 
ay 
f# w (i/i-1) e(i-1) 
v4 7 
10 Oe ae 


Ik 
w:(i/i-1) w.(i/i-1) 
4; i Ae ny Eoe ee 


=\— 


Using the computational results of Table 5 and Table 
6, the cost of I(j,12) for the first term is NP+2N+4, 
ema sem coat for each of the second through the 
a term is 2N+4. Therefore, the total cost for 
the q(i) indicators I(j,12) is = PN + [2N+4]q(i). 
tep S9 involves selecting the subset of terms with 
values of I(j,12) greater than a specified level hy: 
Depencing on the value of hy anceeonoey aiues ot 1(),12)5 
there will be an integer k number of terms left. The size 
of the term vector w(n,i/i-1i) is reduced from q(i) x 1 to 


(oe Where 21S no significant cost of this step. 








Step S10: Form A(i/i-1) using the reduced vector 
w(n,i/i-1) in Eq. {4.23} and substituting 


this into Eq. {4.27} to form; 


A(i/i-1) = 1 W(i/det) W(d/iet) eenese 


N 

The N x k matrix W(i/i-1) is now the reduced data 
matrix for the new model terms. Each element in 
matrix A(i/i-1) requires N multiplications and 1 
division operation. Because of symmetry, there are 
k[k+1]/2 elements that must be calculated, therefore 
the computational cost is [wet |k[ +1 ]/2. 

Step S11: Porm B(i/i-1) using Eq. {4.26} 


B(i/i-1) = 1 Win) W(4/i-1) {4.26} 
N 


The N x P matrix W(i-1) is the data matrix for the 
Ce model obtained previously. Sach element in 
matrix B(i/i-1) requires N multiplications and 1 
division operation. Since there are kP elements 
mae Wess «be «Calculated, the computational cost is 
[u+1]kP. 

Step S12: Form h(i/i-1) using Eq. {4.29} 


eet) = I. W(i/i-t)y {4.29} 
N 


Using the preceding definitions of the sizes of 
matrix W(i/i-1) and vector y, each of the k elements 
in vector h(i/i-i) requires N multiplications and 1 


Givisaen operation. Total cost is k[ +1]. 
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tep $13: Form F(i) using Eq. {4.30} 
men > 1-80 1-1 YG i-1) {4.30} 
The P x P inverse matrix NC ee ee is the least 
squares matrix for the ane. model obtained 
previously. Since meee B(i/i-1) is P x k, each 
element in matrix F(i) requires P multiplications. 
There are kP elements in matrix F(i) and therefore 
the computational cost is k[ P**2]. 
Step Si4: Form G(i) using Eq. {4.31} 
(4) = aA(i/i-1) + B(i/i-1)"R(1) {4.31} 
Usimg the preceding definitions for the sizes of 
matrices B(i/i-1) and F(i), each element in the 
result of the second term on the right side of 
{4.31} requires P multiplications. Since there are 
[x**2 | elements in this resulting matrix, the total 
computational eo is P[k**2]. 
Step S15: Form g(i) using Eq. {4.32} 
mae ee meiaey) + 701) nCi-1) {4.32} 
Using the preceding definitions of the sizes for 
matrix F(i) and vector h(i/i-1), each of the k 
elements in the result of the second term on the 
rim@m>o side of Qiges 2 } requires P multiplications. 
Mmeresosre, the total computational cost is Pk. 
tep S16: Invert G(i) 
Since G(i) is a symmetric matrix of size k, it can 


be inverted at a cost of |k**3]/6 operations. 


5) B, 





Stae S17: Form vector Kite usage Eq . 4.33} 
Per Sry es | (4. 33] 
Since the inverse matrix Gti) is size k x k, and 
g(i) is a k size vector, each of the k elements of 
vector k(i) requires k multiplications. Therefore, 
the computational cost is [k*##2]. 
Sagep 3510; Sormbve for foway UsSttigewear. 145 36} 
eo (i-1) = (i) K(a) (4. 36} 
Sace got) and k(i) are both size k column vectors, 
Gime COmpuCaLioOnal cost is k multiplications. 
Step $19 does not involve any computational cost. 

Perea on sene results of step S19, the growth may stop. 
AGding complexity notation for each step, results in the 
following complexity equation for the block-form technique 
(siems SO through S19). 

& 
O(n) = (k**3)/6 + (P+N+3)(k*#*2)/2 + kU NP+(P**2)+2P4+( 3N+5)]/2] 
+ NP + ([2N+4]q(i) ex 3h 
ff additional growth iterations are required for 
adequate modeling performance, one additional computational 
Seep ise required before starting again at step 387. 


Seepealo. “Porm inverse of A(i) using Eq. {4.37} 


ee | eee wt 
a A(i-1) ce) Cea) ae) EE? 
CG eas ai) Tce (4.37) 
All of the indicated matrices in {4.37} have already 
been calculated. RiemonlyscOmpuLaLlons required are 


those necessary to form the matrix factors 
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-l if -l 
MemeGctg 8 F(i) and F(1)G(i) . USing the 
previously defined sizes of these matrices, these 
factors can both be calculated with a total cost of 


Pel Peea>] + PL k**2). 
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